THERAPEUTIC COMPOSITIONS

RELATED APPLICATIONS 

The present application claims the benefit of Application No. 60/452,682, filed
March 7, 2003; Application No. 60/462,894, filed April 14, 2003; Application
No. 60/465,665, filed April 25, 2003; Application No. 60/463,772, filed April 17, 2003;
Application No. 60/465,802, filed April 25, 2003; Application No. 60/493,986, filed
August 8, 2003; Application No. 60/494,597, filed August 11, 2003; Application No.
60/506,341, filed September 26, 2003; Application No. 60/518,453, filed November 7, 2003;
Application No. 60/454,265, filed March 12, 2003; Application No. 60/454,962, filed March
13, 2003; Application No. 60/455,050, filed March 13, 2003; Application No. 60/469,612,
filed May 9, 2003; Application No. 60/510,246, filed October 9, 2003; and Application
No. 60/510,318, filed October 10, 2003. The contents of these provisional applications are
hereby incorporated by reference in their entirety.

TECHNICAL FIELD 

The invention relates to RNAi and related methods, e.g., methods of making and 
using iRNA agents. 

BACKGROUND 

RNA interference or "RNAi" is a term initially coined by Fire and co-workers to
describe the observation that double-stranded RNA (dsRNA) can block gene expression
when it is introduced into worms (Fire et al. (1998) Nature 391, 806-811). Short dsRNA
directs gene-specific, post-transcriptional silencing in many organisms, including vertebrates,
and has provided a new tool for studying gene function. RNAi may involve mRNA
degradation.

SUMMARY

A number of advances related to the application of RNAi to the treatment of subjects
are disclosed herein. For example, the invention features iRNA agents targeted to specific
genes; palindromic iRNA agents; iRNA agents having non-canonical monomer pairings;
iRNA agents having particular structures or architectures, e.g., the Z-X-Y or asymmetrical
iRNA agents described herein; drug delivery conjugates for the delivery of iRNA agents;
amphipathic substances for the delivery of iRNA agents; and iRNA agents having
chemical modifications for optimizing a property of the iRNA agent. The invention features
each of these advances broadly as well as in combinations. For example, an iRNA agent
targeted to a specific gene can also include one or more of a palindromic, non-canonical,
Z-X-Y, or asymmetric structure. Other nonlimiting examples of combinations include an
asymmetric structure combined with a chemical modification, or formulations or methods or
routes of delivery combined with, e.g., chemical modifications or architectures described
herein. The iRNA agents of the invention can include any one of these advances, or pairwise
or higher order combinations of the separate advances.

In one aspect, the invention features iRNA agents that can target more than one RNA 
region, and methods of using and making the iRNA agents. 

In another aspect, an iRNA agent includes a first and second sequence that are 
sufficiently complementary to each other to hybridize. The first sequence can be 
complementary to a first target RNA region and the second sequence can be complementary 
to a second target RNA region. 

In one embodiment, the first and second sequences of the iRNA agent are on different 
RNA strands, and the mismatch between the first and second sequences is less than 50%, 
40%, 30%, 20%, 10%, 5%, or 1%. 
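The mismatch percentages recited above can be made concrete with a short calculation. The following is a minimal sketch, not a method taken from this disclosure: it assumes two equal-length RNA strands written 5' to 3', pairs them antiparallel, and counts only Watson-Crick pairs (G-U wobble pairs are treated as mismatches). The function names and the default cutoff are hypothetical.

```python
# Watson-Crick partners for RNA bases (G-U wobble pairs are deliberately not counted).
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def mismatch_fraction(first: str, second: str) -> float:
    """Fraction of positions that fail to pair when two equal-length RNA
    strands (both written 5'->3') are aligned antiparallel."""
    if len(first) != len(second):
        raise ValueError("this sketch only handles strands of equal length")
    if not first:
        raise ValueError("sequences must be non-empty")
    # Position i of `first` faces position len-1-i of `second`.
    mismatches = sum(
        1 for a, b in zip(first, reversed(second)) if PAIR.get(a) != b
    )
    return mismatches / len(first)


def meets_threshold(first: str, second: str, max_mismatch: float = 0.10) -> bool:
    """True when the mismatch fraction is below the chosen cutoff."""
    return mismatch_fraction(first, second) < max_mismatch
```

For example, a strand and its exact reverse complement give a mismatch fraction of 0, while a single unpaired position in an 8-nucleotide pair gives 0.125, which passes a 20% cutoff but not a 10% cutoff.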

In another embodiment, the first and second sequences of the iRNA agent are on the 
same RNA strand, and in a related embodiment more than 50%, 60%, 70%, 80%, 90%, 95%, 
or 1% of the iRNA agent is in bimolecular form. 

In another embodiment, the first and second sequences of the iRNA agent are fully 
complementary to each other. 

In one embodiment, the first target RNA region is encoded by a first gene and the 
second target RNA region is encoded by a second gene, and in another embodiment, the first 
and second target RNA regions are different regions of an RNA from a single gene. In
another embodiment, the first and second sequences differ by at least 1 and no more than 6 
nucleotides. 

In certain embodiments, the first and second target RNA regions are on transcripts
encoded by first and second sequence variants, e.g., first and second alleles, of a gene. The
sequence variants can be mutations or polymorphisms, for example.

In certain embodiments, the first target RNA region includes a nucleotide 
substitution, insertion, or deletion relative to the second target RNA region. 

In other embodiments, the second target RNA region is a mutant or variant of the first
target RNA region.

In certain embodiments, the first and second target RNA regions comprise viral, e.g.,
HCV, or human RNA regions. The first and second target RNA regions can also be on
variant transcripts of an oncogene or include different mutations of a tumor suppressor gene
transcript. In one embodiment, the oncogene or tumor suppressor gene is expressed in the
liver. In addition, the first and second target RNA regions can correspond to hot-spots for
genetic variation.

In another aspect, the invention features a mixture of varied iRNA agent molecules,
including one iRNA agent that includes a first sequence and a second sequence sufficiently
complementary to each other to hybridize, and where the first sequence is complementary to
a first target RNA region and the second sequence is complementary to a second target RNA
region. The mixture also includes at least one additional iRNA agent variety that includes a
third sequence and a fourth sequence sufficiently complementary to each other to hybridize,
and where the third sequence is complementary to a third target RNA region and the fourth
sequence is complementary to a fourth target RNA region. In addition, the first or second
sequence is sufficiently complementary to the third or fourth sequence to be capable of
hybridizing to each other. In one embodiment, at least one, two, three or all four of the target
RNA regions are expressed in the liver. Exemplary RNAs are transcribed from the apoB-100
gene, glucose-6-phosphatase gene, beta catenin gene, or an HCV gene.

In certain embodiments, the first and second sequences are on the same or different
RNA strands, and the third and fourth sequences are on the same or different RNA strands.

In one embodiment, the mixture further includes a third iRNA agent that is composed
of the first or second sequence and the third or fourth sequence.

In one embodiment, the first sequence is identical to at least one of the second, third 
and fourth sequences, and in another embodiment, the first region differs by at least 1 but no 
more than 6 nucleotides from at least one of the second, third and fourth regions. 

In certain embodiments, the first target RNA region comprises a nucleotide 
substitution, insertion, or deletion relative to the second, third or fourth target RNA region. 

The target RNA regions can be variant sequences of a viral or human RNA, and in 
certain embodiments, at least two of the target RNA regions can be on variant transcripts of 
an oncogene or tumor suppressor gene. In one embodiment, the oncogene or tumor 
suppressor gene is expressed in the liver. 

In certain embodiments, at least two of the target RNA regions correspond to
hot-spots for genetic variation.

In one embodiment, the iRNA agents of the invention are formulated for 
pharmaceutical use. In one aspect, the invention provides a container (e.g., a vial, syringe, 
nebulizer, etc.) to hold the iRNA agents described herein.

Another aspect of the invention features a method of making an iRNA agent. The 
method includes constructing an iRNA agent that has a first sequence complementary to a 
first target RNA region, and a second sequence complementary to a second target RNA 
region. The first and second target RNA regions have been identified as being sufficiently 
complementary to each other to be capable of hybridizing. In one embodiment, the first and 
second target RNA regions are on transcripts expressed in the liver. 

In certain embodiments, the first and second target RNA regions can correspond to 
two different regions encoded by one gene, or to regions encoded by two different genes. 
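Identifying target RNA regions that are sufficiently complementary to each other can be illustrated with a brute-force scan. The sketch below is an assumption-laden illustration rather than the procedure used to generate any table in this disclosure: it slides a fixed-length window over one RNA, takes the reverse complement, and reports every window of a second RNA that matches within a mismatch budget. The window length, mismatch budget, and function names are hypothetical, and inputs are assumed to contain only A, C, G, and U.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence written 5'->3'."""
    return "".join(PAIR[base] for base in reversed(rna))


def complementary_region_pairs(rna_a, rna_b, length=19, max_mismatches=2):
    """Return a list of (start_a, start_b, mismatches) for every window of
    `rna_a` whose reverse complement matches a window of `rna_b` with at
    most `max_mismatches` differing positions."""
    hits = []
    for i in range(len(rna_a) - length + 1):
        rc = reverse_complement(rna_a[i:i + length])
        for j in range(len(rna_b) - length + 1):
            window = rna_b[j:j + length]
            mismatches = sum(1 for x, y in zip(rc, window) if x != y)
            if mismatches <= max_mismatches:
                hits.append((i, j, mismatches))
    return hits
```

Passing the same transcript as both arguments searches for two regions within a single gene; passing two different transcripts searches across two genes. The scan is quadratic in transcript length, which is acceptable for a sketch but would likely need indexing for genome-scale use.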

Another aspect of the invention features a method of making an iRNA agent 
composition. The method includes obtaining or providing information about a region of an 
RNA of a target gene (e.g., a viral or human gene, or an oncogene or tumor suppressor, e.g., 
p53), where the region has high variability or mutational frequency (e.g., in humans). In 
addition, information about a plurality of RNA targets within the region is obtained or 
provided, where each RNA target corresponds to a different variant or mutant of the gene 
(e.g., a region including the codon encoding p53 248Q and/or p53 249S). The iRNA agent is 
constructed such that a first sequence is complementary to a first of the plurality of variant
RNA targets (e.g., encoding 249Q) and a second sequence is complementary to a second of
the plurality of variant RNA targets (e.g., encoding 249S). The first and second sequences 
are sufficiently complementary to hybridize. In certain embodiments, the target gene can be 
a viral or human gene expressed in the liver. 

In one embodiment, sequence analysis, e.g., to identify common mutants in the target 
gene, is used to identify a region of the target gene that has high variability or mutational 
frequency. For example, sequence analysis can be used to identify regions of apoB-100 or 
beta catenin that have high variability or mutational frequency. In another embodiment, the 
region of the target gene having high variability or mutational frequency is identified by 
obtaining or providing genotype information about the target gene from a population. In 
another embodiment, the genotype information can be from a population suffering from a 
liver disorder, such as hepatocellular carcinoma or hepatoblastoma. 
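One way to picture the sequence analysis described above is a per-position variability score computed over aligned sequences from a population, followed by a search for the window with the highest total score. This is a minimal sketch under stated assumptions: the input is a list of pre-aligned, equal-length sequences; variability is defined here as the fraction of sequences that differ from the most common base at a position; and the names `position_variability` and `most_variable_window` are hypothetical.

```python
from collections import Counter


def position_variability(aligned):
    """For each alignment column, the fraction of sequences that differ
    from the most common base in that column."""
    if not aligned:
        raise ValueError("need at least one sequence")
    width = len(aligned[0])
    if any(len(seq) != width for seq in aligned):
        raise ValueError("sequences must be pre-aligned to equal length")
    scores = []
    for column in zip(*aligned):
        most_common_count = Counter(column).most_common(1)[0][1]
        scores.append(1 - most_common_count / len(aligned))
    return scores


def most_variable_window(aligned, window=19):
    """Return (start, total_score) of the window with the highest summed variability."""
    scores = position_variability(aligned)
    if window > len(scores):
        raise ValueError("window longer than alignment")
    best_start, best_total = 0, sum(scores[:window])
    total = best_total
    # Slide the window one position at a time, updating the running sum.
    for start in range(1, len(scores) - window + 1):
        total += scores[start + window - 1] - scores[start - 1]
        if total > best_total:
            best_start, best_total = start, total
    return best_start, best_total
```

Other variability measures (for example, entropy or mutation counts from genotype data) could be substituted for the simple fraction used here without changing the window search.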

Another aspect of the invention features a method of modulating expression, e.g., 
downregulating or silencing, a target gene, by providing an iRNA agent that has a first 
sequence and a second sequence sufficiently complementary to each other to hybridize. In 
addition, the first sequence is complementary to a first target RNA region and the second 
sequence is complementary to a second target RNA region. 

In one embodiment, the iRNA agent is administered to a subject, e.g., a human. 
In another embodiment, the first and second sequences are between 15 and 30 
nucleotides in length. 

In one embodiment, the method of modulating expression of the target gene further 
includes providing a second iRNA agent that has a third sequence complementary to a third 
target RNA region. The third sequence can be sufficiently complementary to the first or 
second sequence to be capable of hybridizing to either the first or second sequence. 

Another aspect of the invention features a method of modulating expression, e.g., 
downregulating or silencing, a plurality of target RNAs, each of the plurality of target RNAs 
corresponding to a different target gene. The method includes providing an iRNA agent 
selected by identifying a first region in a first target RNA of the plurality and a second region 
in a second target RNA of the plurality, where the first and second regions are sufficiently 
complementary to each other to be capable of hybridizing.

In another aspect of the invention, an iRNA agent molecule includes a first sequence
complementary to a first variant RNA target region and a second sequence complementary to 
a second variant RNA target region, and the first and second variant RNA target regions 
correspond to first and second variants or mutants of a target gene. In certain embodiments, 
the target gene is an apoB-100, beta catenin, or glucose-6-phosphatase gene.

In one embodiment, the target gene is a viral gene (e.g., an HCV gene), tumor
suppressor or oncogene. 

In another embodiment, the first and second variant target RNA regions include 
allelic variants of the target gene. 

In another embodiment, the first and second variant RNA target regions comprise 
mutations (e.g., point mutations) or polymorphisms of the target gene.

In one embodiment, the first and second variant RNA target regions correspond to 
hot-spots for genetic variation. 

Another aspect of the invention features a plurality (e.g., a panel or bank) of iRNA
agents. Each of the iRNA agents of the plurality includes a first sequence complementary to 
a first variant target RNA region and a second sequence complementary to a second variant 
target RNA region, where the first and second variant target RNA regions correspond to first 
and second variants of a target gene. In certain embodiments, the variants are allelic variants 
of the target gene. 
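A panel of this kind can be represented as a simple data structure in which each agent records the two variants it is designed against, so that agents can be matched to the variants present in a subject. The following sketch is illustrative only: the class `PanelAgent`, the function `select_agents`, the agent names, and the variant labels are all hypothetical, and the ranking rule (agents covering more of the subject's variants first) is an assumption rather than a criterion stated in this disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PanelAgent:
    name: str            # hypothetical identifier for the agent
    first_variant: str   # variant targeted by the first sequence
    second_variant: str  # variant targeted by the second sequence


def select_agents(panel, subject_variants):
    """Return agents that target at least one variant in the subject's genotype,
    ordered so that agents covering more of the genotype come first."""
    subject = set(subject_variants)
    scored = []
    for agent in panel:
        covered = {agent.first_variant, agent.second_variant} & subject
        if covered:
            scored.append((len(covered), agent))
    # Python's sort is stable, so ties keep their original panel order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [agent for _, agent in scored]
```

With a three-agent panel and a subject carrying variants "v1" and "v2", an agent targeting both variants is returned first, an agent targeting only "v1" second, and an agent targeting neither is omitted.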

Another aspect of the invention provides a method of identifying an iRNA agent for
treating a subject. The method includes providing or obtaining information, e.g., a genotype,
about a target gene, providing or obtaining information about a plurality (e.g., panel or bank)
of iRNA agents, comparing the information about the target gene to information about the
plurality of iRNA agents, and selecting one or more of the plurality of iRNA agents for
treating the subject. Each of the plurality of iRNA agents includes a first sequence
complementary to a first variant target RNA region and a second sequence complementary to
a second variant target RNA region, and the first and second variant target RNA regions
correspond to first and second variants of the target gene. The target gene can be an
endogenous gene of the subject or a viral gene. The information about the plurality of iRNA
agents can be the sequence of the first or second sequence of one or more of the plurality.

In certain embodiments, at least one of the selected iRNA agents includes a sequence
capable of hybridizing to an RNA region corresponding to the target gene, and at least one of
the selected iRNA agents comprises a sequence capable of hybridizing to an RNA region
corresponding to a variant or mutant of the target gene.

In one aspect, the invention relates to compositions and methods for silencing genes
expressed in the liver, e.g., to treat disorders of or related to the liver. An iRNA agent
composition of the invention can be one which has been modified to alter distribution in
favor of the liver.

In another aspect, the invention relates to iRNA agents that can target more than one
RNA region, and methods of using and making the iRNA agents. In one embodiment, the
RNA is from a gene that is active in the liver, e.g., apoB-100, glucose-6-phosphatase,
beta-catenin, or Hepatitis C virus (HCV).

In another aspect, an iRNA agent includes a first and second sequence that are
sufficiently complementary to each other to hybridize. The first sequence can be
complementary to a first target RNA region and the second sequence can be complementary
to a second target RNA region. For example, the first sequence can be complementary to a
first target apoB-100 RNA region and the second sequence can be complementary to a
second target apoB-100 RNA region.

In one embodiment, the first target RNA region is encoded by a first gene, e.g., a
gene expressed in the liver, and the second target RNA region is encoded by a second gene,
e.g., a second gene expressed in the liver. In another embodiment, the first and second target
RNA regions are different regions of an RNA from a single gene, e.g., a single gene that is at
least expressed in the liver. In another embodiment, the first and second sequences differ by
at least one and no more than six nucleotides.

In another embodiment, sequence analysis, e.g., to identify common mutants in the
target gene, is used to identify a region of the target gene that has high variability or
mutational frequency. For example, sequence analysis can be used to identify regions of
apoB-100 or beta catenin that have high variability or mutational frequency. In another
embodiment, the region of the target gene having high variability or mutational frequency is
identified by obtaining or providing genotype information about the target gene from a
population. In particular, the genotype information can be from a population suffering from
a liver disorder, such as hepatocellular carcinoma or hepatoblastoma.

In another aspect, the invention features a method for reducing apoB-100 levels in a 
subject, e.g., a mammal, such as a human. The method includes administering to a subject an 
iRNA agent which targets apoB-100. The iRNA agent can be one described herein, and can be
a dsRNA that has a sequence that is substantially identical to a sequence of the apoB-100 
gene. The iRNA can be less than 30 nucleotides in length, e.g., 21-23 nucleotides. 
Preferably, the iRNA is 21 nucleotides in length. In one embodiment, the iRNA is 21 
nucleotides in length, and the duplex region of the iRNA is 19 nucleotides. In another 
embodiment, the iRNA is greater than 30 nucleotides in length. 

In a preferred embodiment, the subject is treated with an iRNA agent which targets
one of the sequences listed in Tables 5 and 6. In a preferred embodiment it targets both
sequences of a palindromic pair provided in Tables 5 and 6. The targets are
listed in descending order of preferability; in other words, the more preferred targets are
listed earlier in Tables 5 and 6.

In a preferred embodiment the iRNA agent will include regions, or strands, which are 
complementary to a pair in Tables 5 and 6. In a preferred embodiment the iRNA agent will 
include regions complementary to the palindromic pairs of Tables 5 and 6 as a duplex region. 

In a preferred embodiment the duplex region of the iRNA agent will target a sequence
listed in Tables 5 and 6 but will not be perfectly complementary with the target sequence,
e.g., it will not be complementary at one or more base pairs. Preferably it will have no more than
1, 2, 3, 4, or 5 bases, in total, or per strand, which do not hybridize with the target sequence.

In a preferred embodiment the iRNA agent includes overhangs, e.g., 3' or 5' 
overhangs, preferably one or more 3' overhangs. Overhangs are discussed in detail 
elsewhere herein but are preferably about 2 nucleotides in length. The overhangs can be 
complementary to the gene sequences being targeted or can be other sequence. TT is a 
preferred overhang sequence. The first and second iRNA agent sequences can also be joined, 
e.g., by additional bases to form a hairpin, or by other non-base linkers. 
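The strand geometry described above (21-nucleotide strands, a 19-nucleotide duplex region, and 2-nucleotide 3' overhangs such as TT) can be sketched in a few lines. This is a hedged illustration, not a design procedure from this disclosure: it assumes the input is a 19-nucleotide target RNA site written 5' to 3', builds the antisense strand as its Watson-Crick reverse complement, and appends the overhang to the 3' end of each strand. The function name `design_duplex` is hypothetical.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def design_duplex(target_19mer: str, overhang: str = "TT"):
    """Return (sense, antisense), both 5'->3', for a 19-nt target RNA site.

    The antisense strand is the reverse complement of the target; each
    strand carries `overhang` at its 3' end, so the duplex region is 19 bp.
    """
    target = target_19mer.upper()
    if len(target) != 19 or set(target) - set(PAIR):
        raise ValueError("expected a 19-nt RNA sequence over A, C, G, U")
    sense = target + overhang
    antisense = "".join(PAIR[b] for b in reversed(target)) + overhang
    return sense, antisense
```

Each returned strand is 21 nucleotides long with the default overhang. Joining the two strands through additional bases or a non-base linker to form a hairpin, as mentioned above, is not covered by this sketch.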

The iRNA agent that targets apoB-100 can be administered in an amount sufficient to 
reduce expression of apoB-100 mRNA. In one embodiment, the iRNA agent is administered 
in an amount sufficient to reduce expression of apoB-100 protein (e.g., by at least 2%, 4%, 
6%, 10%, 15%, or 20%). Preferably, the iRNA agent does not reduce expression of apoB-48
mRNA or protein. This can be effected, e.g., by selection of an iRNA agent which 
specifically targets the nucleotides subject to RNA editing in the apoB-100 transcript. 

The iRNA agent that targets apoB-100 can be administered to a subject, wherein the
subject is suffering from a disorder characterized by elevated or otherwise unwanted
expression of apoB-100, elevated or otherwise unwanted levels of cholesterol, and/or
dysregulation of lipid metabolism. The iRNA agent can be administered to an individual at
risk for the disorder to delay onset of the disorder or a symptom of the disorder. These
disorders include HDL/LDL cholesterol imbalance; dyslipidemias, e.g., familial combined
hyperlipidemia (FCHL), acquired hyperlipidemia; hypercholesterolemia; statin-resistant
hypercholesterolemia; coronary artery disease (CAD); coronary heart disease (CHD); and
atherosclerosis. In one embodiment, the iRNA that targets apoB-100 is administered to a
subject suffering from statin-resistant hypercholesterolemia.

The apoB-100 iRNA agent can be administered in an amount sufficient to reduce
levels of serum LDL-C and/or HDL-C and/or total cholesterol in a subject. For example, the
iRNA is administered in an amount sufficient to decrease total cholesterol by at least 0.5%,
1%, 2.5%, 5%, or 10% in the subject. In one embodiment, the iRNA agent is administered in
an amount sufficient to reduce the risk of myocardial infarction in the subject.

In a preferred embodiment the iRNA agent is administered repeatedly. 
Administration of an iRNA agent can be carried out over a range of time periods. It can be 
administered daily, once every few days, weekly, or monthly. The timing of administration 
can vary from patient to patient, depending on such factors as the severity of a patient's 
symptoms. For example, an effective dose of an iRNA agent can be administered to a patient 
once a month for an indefinite period of time, or until the patient no longer requires therapy. 
In addition, sustained release compositions containing an iRNA agent can be used to
maintain a relatively constant dosage in the patient's blood. 

In one embodiment, the iRNA agent can be targeted to the liver, and apoB expression
levels are decreased in the liver following administration of the apoB iRNA agent. For
example, the iRNA agent can be complexed with a moiety that targets the liver, e.g., an
antibody or ligand that binds a receptor on the liver.

The iRNA agent, particularly an iRNA agent that targets apoB, beta-catenin or
glucose-6-phosphatase RNA, can be targeted to the liver, for example by associating, e.g., 
conjugating the iRNA agent to a lipophilic moiety, e.g., a lipid, cholesterol, oleyl, retinyl, or 
cholesteryl residue (see Table 1). Other lipophilic moieties that can be associated, e.g., 
conjugated with the iRNA agent include cholic acid, adamantane acetic acid, 1-pyrene 
butyric acid, dihydrotestosterone, 1,3-Bis-O(hexadecyl)glycerol, geranyloxyhexyl group,
hexadecylglycerol, borneol, menthol, 1,3-propanediol, heptadecyl group, palmitic acid,
myristic acid, O3-(oleoyl)lithocholic acid, O3-(oleoyl)cholenic acid, dimethoxytrityl, or
phenoxazine. In one embodiment, the iRNA agent can be targeted to the liver by associating, 
e.g., conjugating, the iRNA agent to a low-density lipoprotein (LDL), e.g., a lactosylated 
LDL. In another embodiment, the iRNA agent can be targeted to the liver by associating, 
e.g., conjugating, the iRNA agent to a polymeric carrier complex with sugar residues. 

In another embodiment, the iRNA agent can be targeted to the liver by associating, 
e.g., conjugating, the iRNA agent to a liposome complexed with sugar residues. A targeting 
agent that incorporates a sugar, e.g., galactose and/or analogues thereof, is particularly useful. 
These agents target, in particular, the parenchymal cells of the liver (see Table 1). In a 
preferred embodiment, the targeting moiety includes more than one galactose moiety, 
preferably two or three. Preferably, the targeting moiety includes 3 galactose moieties, e.g., 
spaced about 15 angstroms from each other. The targeting moiety can be lactose. A lactose 
is a glucose coupled to a galactose. Preferably, the targeting moiety includes three lactoses. 
The targeting moiety can also be N-Acetyl-Galactosamine or N-Ac-Glucosamine. A mannose
or mannose-6-phosphate targeting moiety can be used for macrophage targeting.

The targeting agent can be linked directly, e.g., covalently or non-covalently, to the
iRNA agent, or to another delivery or formulation modality, e.g., a liposome. E.g., the iRNA 
agents with or without a targeting moiety can be incorporated into a delivery modality, e.g., a 
liposome, with or without a targeting moiety. 

It is particularly preferred to use an iRNA agent conjugated to a lipophilic molecule,
wherein the iRNA agent targets apoB, beta-catenin or glucose-6-phosphatase.

In one embodiment, the iRNA agent has been modified, or is associated with a
delivery agent, e.g., a delivery agent described herein, e.g., a liposome, which has been
modified to alter distribution in favor of the liver. In one embodiment, the modification
mediates association with a serum albumin (SA), e.g., a human serum albumin (HSA), or a
fragment thereof.

The iRNA agent, particularly an iRNA agent that targets apoB, beta-catenin or 
glucose-6-phosphatase RNA, can be targeted to the liver, for example by associating, e.g., 
conjugating the iRNA agent to an SA molecule, e.g., an HSA molecule, or a fragment 
thereof. In one embodiment, the iRNA agent or composition thereof has an affinity for an 
SA, e.g., HSA, which is sufficiently high such that its levels in the liver are at least 10, 20, 
30, 50, or 100% greater in the presence of SA, e.g., HSA, or is such that addition of 
exogenous SA will increase delivery to the liver. These criteria can be measured, e.g., by 
testing distribution in a mouse in the presence or absence of exogenous mouse or human SA. 

The SA, e.g., HSA, targeting agent can be linked directly, e.g., covalently or
non-covalently, to the iRNA agent, or to another delivery or formulation modality, e.g., a
liposome. E.g., the iRNA agents with or without a targeting moiety can be incorporated into 
a delivery modality, e.g., a liposome, with or without a targeting moiety. 

It is particularly preferred to use an iRNA conjugated to an SA, e.g., an HSA, 
molecule wherein the iRNA agent is an apoB, beta-catenin or glucose-6-phosphatase iRNA 
targeting agent. 

In another aspect, the invention features a method for reducing
glucose-6-phosphatase levels in a subject, e.g., a mammal, such as a human. The method includes
administering to a subject an iRNA agent which targets glucose-6-phosphatase. The iRNA 
agent can be a dsRNA that has a sequence that is substantially identical to a sequence of the 
glucose-6-phosphatase gene. 

In a preferred embodiment, the subject is treated with an iRNA agent which targets
one of the sequences listed in Table 7. In a preferred embodiment it targets both sequences
of a palindromic pair provided in Table 7. The targets are listed in
descending order of preferability; in other words, the more preferred targets are listed earlier
in Table 7.

In a preferred embodiment the iRNA agent will include regions, or strands, which are
complementary to a pair in Table 7. In a preferred embodiment the iRNA agent will include
regions complementary to the palindromic pairs of Table 7 as a duplex region.

In a preferred embodiment the duplex region of the iRNA agent will target a sequence
listed in Table 7 but will not be perfectly complementary with the target sequence, e.g., it
will not be complementary at one or more base pairs. Preferably it will have no more than 1, 2,
3, 4, or 5 bases, in total, or per strand, which do not hybridize with the target sequence.

In a preferred embodiment the iRNA agent includes overhangs, e.g., 3' or 5' 
overhangs, preferably one or more 3' overhangs. Overhangs are discussed in detail 
elsewhere herein but are preferably about 2 nucleotides in length. The overhangs can be 
complementary to the gene sequences being targeted or can be other sequence. TT is a 
preferred overhang sequence. The first and second iRNA agent sequences can also be joined, 
e.g., by additional bases to form a hairpin, or by other non-base linkers. 

Table 7 refers to sequences from human glucose-6-phosphatase. Table 8 refers to 
sequences from rat glucose-6-phosphatase. The sequences from Table 8 can be used, e.g., in
experiments with rats or cultured rat cells. 

In a preferred embodiment the iRNA agent can have any architecture, e.g., an architecture
described herein. E.g., it can be incorporated into an iRNA agent having an overhang
structure, overall length, hairpin vs. two-strand structure, as described herein. In addition, 
monomers other than naturally occurring ribonucleotides can be used in the selected iRNA 
agent. 

The iRNA that targets glucose-6-phosphatase can be administered in an amount 
sufficient to reduce expression of glucose-6-phosphatase mRNA. 

The iRNA that targets glucose-6-phosphatase can be administered to a subject to 
inhibit hepatic glucose production, for the treatment of glucose-metabolism-related disorders, 
such as diabetes, e.g., type-2-diabetes mellitus. The iRNA agent can be administered to an 
individual at risk for the disorder to delay onset of the disorder or a symptom of the disorder. 

In other embodiments, iRNA agents having sequence similarity to the following
genes can also be used to inhibit hepatic glucose production. These other genes include
forkhead homologue in rhabdomyosarcoma (FKHR); glucagon; glucagon receptor;
glycogen phosphorylase; PPAR-Gamma Coactivator (PGC-1); Fructose-1,6-bisphosphatase;
glucose-6-phosphate locator; glucokinase inhibitory regulatory protein; and
phosphoenolpyruvate carboxykinase (PEPCK).

In one embodiment, the iRNA agent can be targeted to the liver, and RNA expression
levels of the targeted genes are decreased in the liver following administration of the iRNA
agent.

The iRNA agent can be one described herein, and can be a dsRNA that has a 
sequence that is substantially identical to a sequence of a target gene. The iRNA can be less 
than 30 nucleotides in length, e.g., 21-23 nucleotides. Preferably, the iRNA is 21 nucleotides 
in length. In one embodiment, the iRNA is 21 nucleotides in length, and the duplex region of 
the iRNA is 19 nucleotides. In another embodiment, the iRNA is greater than 30 nucleotides 
in length.

In another aspect, the invention features a method for reducing beta-catenin levels in 
a subject, e.g., a mammal, such as a human. The method includes administering to a subject 
an iRNA agent that targets beta-catenin. The iRNA agent can be one described herein, and
can be a dsRNA that has a sequence that is substantially identical to a sequence of the
beta-catenin gene. The iRNA can be less than 30 nucleotides in length, e.g., 21-23 nucleotides.
Preferably, the iRNA is 21 nucleotides in length. In one embodiment, the iRNA is 21 
nucleotides in length, and the duplex region of the iRNA is 19 nucleotides. In another 
embodiment, the iRNA is greater than 30 nucleotides in length. 

In a preferred embodiment, the subject is treated with an iRNA agent which targets
one of the sequences listed in Table 9. In a preferred embodiment it targets both sequences
of a palindromic pair provided in Table 9. The targets are listed in
descending order of preferability; in other words, the more preferred targets are listed earlier
in Table 9.

In a preferred embodiment the iRNA agent will include regions, or strands, which are 
complementary to a pair in Table 9. In a preferred embodiment the iRNA agent will include 
regions complementary to the palindromic pairs of Table 9 as a duplex region.

In a preferred embodiment the duplex region of the iRNA agent will target a sequence
listed in Table 9 but will not be perfectly complementary with the target sequence, e.g., it
will not be complementary at 1 or more base pairs. Preferably it will have no more than 1, 2,
3, 4, or 5 bases, in total, or per strand, which do not hybridize with the target sequence.

In a preferred embodiment the iRNA agent includes overhangs, e.g., 3' or 5' 
overhangs, preferably one or more 3' overhangs. Overhangs are discussed in detail 
elsewhere herein but are preferably about 2 nucleotides in length. The overhangs can be 
complementary to the gene sequences being targeted or can be other sequence. TT is a 
preferred overhang sequence. The first and second iRNA agent sequences can also be joined, 
e.g., by additional bases to form a hairpin, or by other non-base linkers.

The iRNA agent that targets beta-catenin can be administered in an amount sufficient 
to reduce expression of beta-catenin mRNA. In one embodiment, the iRNA agent is 
administered in an amount sufficient to reduce expression of beta-catenin protein (e.g., by at 
least 2%, 4%, 6%, 10%, 15%, 20%). 

The iRNA agent that targets beta-catenin can be administered to a subject, wherein 
the subject is suffering from a disorder characterized by unwanted cellular proliferation in the 
liver or of liver tissue, e.g., metastatic tissue originating from the liver. Examples include a
benign or malignant disorder, e.g., a cancer, e.g., a hepatocellular carcinoma (HCC), hepatic 
metastasis, or hepatoblastoma. 

The iRNA agent can be administered to an individual at risk for the disorder to delay 
onset of the disorder or a symptom of the disorder.

In a preferred embodiment the iRNA agent is administered repeatedly. 
Administration of an iRNA agent can be carried out over a range of time periods. It can be 
administered daily, once every few days, weekly, or monthly. The timing of administration 
can vary from patient to patient, depending on such factors as the severity of a patient's 
symptoms. For example, an effective dose of an iRNA agent can be administered to a patient 
once a month for an indefinite period of time, or until the patient no longer requires therapy. 
In addition, sustained release compositions containing an iRNA agent can be used to 
maintain a relatively constant dosage in the patient's blood. 

In one embodiment, the iRNA agent can be targeted to the liver, and beta-catenin 
expression levels are decreased in the liver following administration of the beta-catenin iRNA
agent. For example, the iRNA agent can be complexed with a moiety that targets the liver,
e.g., an antibody or ligand that binds a receptor on the liver. 

In another aspect, the invention provides methods to treat liver disorders, e.g., 
disorders characterized by unwanted cell proliferation, hematological disorders, disorders 
characterized by inflammation, and metabolic or viral diseases or disorders of the
liver. A proliferation disorder of the liver can be, for example, a benign or malignant
disorder, e.g., a cancer, e.g., a hepatocellular carcinoma (HCC), hepatic metastasis, or
hepatoblastoma. A hepatic hematology or inflammation disorder can be a disorder involving 
clotting factors, a complement-mediated inflammation or a fibrosis, for example. Metabolic 
diseases of the liver can include dyslipidemias, and irregularities in glucose regulation. Viral 
diseases of the liver can include hepatitis C or hepatitis B. In one embodiment, a liver 
disorder is treated by administering one or more iRNA agents that have a sequence that is 
substantially identical to a sequence in a gene involved in the liver disorder. 

In one embodiment an iRNA agent to treat a liver disorder has a sequence which is 
substantially identical to a sequence of the beta-catenin or c-jun gene. In another 
embodiment, such as for the treatment of hepatitis C or hepatitis B, the iRNA agent can have 
a sequence that is substantially identical to a sequence of a gene of the hepatitis C virus or the 
hepatitis B virus, respectively. For example, the iRNA agent can target the 5' core region of 
HCV. This region lies just downstream of the ribosomal toe-print straddling the initiator 
methionine. Alternatively, an iRNA agent of the invention can target any one of the 
nonstructural proteins of HCV: NS3, 4A, 4B, 5A, or 5B. For the treatment of hepatitis B, an 
iRNA agent can target the protein X (HBx) gene, for example. 

In a preferred embodiment, the subject is treated with an iRNA agent which targets 
one of the sequences listed in Table 10. In a preferred embodiment it targets both sequences 
of a palindromic pair provided in Table 10. The most preferred targets are listed in
descending order of preferability; in other words, the more preferred targets are listed earlier
in Table 10.

In a preferred embodiment the iRNA agent will include regions, or strands, which are 
complementary to a pair in Table 10. In a preferred embodiment the iRNA agent will 
include regions complementary to the palindromic pairs of Table 10 as a duplex region.

In a preferred embodiment the duplex region of the iRNA agent will target a sequence
listed in Table 10, but will not be perfectly complementary with the target sequence, e.g., it
will not be complementary at 1 or more base pairs. Preferably it will have no more than 1, 2,
3, 4, or 5 bases, in total, or per strand, which do not hybridize with the target sequence.

In a preferred embodiment the iRNA agent includes overhangs, e.g., 3' or 5' 
overhangs, preferably one or more 3' overhangs. Overhangs are discussed in detail 
elsewhere herein but are preferably about 2 nucleotides in length. The overhangs can be 
complementary to the gene sequences being targeted or can be other sequence. TT is a 
preferred overhang sequence. The first and second iRNA agent sequences can also be joined, 
e.g., by additional bases to form a hairpin, or by other non-base linkers. 

In another aspect, an iRNA agent can be administered to modulate blood clotting, 
e.g., to reduce the tendency to form a blood clot. In a preferred embodiment the iRNA agent 
targets Factor V expression, preferably in the liver. One or more iRNA agents can be used to 
target a wild type allele, a mutant allele, e.g., the Leiden Factor V allele, or both. Such 
administration can be used to treat or prevent venous thrombosis, e.g., deep vein thrombosis 
or pulmonary embolism, or another disorder caused by elevated or otherwise unwanted 
expression of Factor V, in, e.g., the liver. In one embodiment the iRNA agent can treat a 
subject, e.g., a human who has Factor V Leiden or other genetic trait associated with an 
unwanted tendency to form blood clots. 

In a preferred embodiment administration of an iRNA agent which targets Factor V is
combined with the administration of a second treatment, e.g., a treatment which reduces the
tendency of the blood to clot, e.g., the administration of heparin or of a low molecular weight heparin.

In one embodiment, the iRNA agent that targets Factor V can be used as a 
prophylaxis in patients, e.g., patients with Factor V Leiden, who are placed at risk for a 
thrombosis, e.g., those about to undergo surgery, in particular those about to undergo high- 
risk surgical procedures known to be associated with formation of venous thrombosis, those 
about to undergo a prolonged period of relative inactivity, e.g., on a motor vehicle, train or 
airplane flight, e.g., a flight or other trip lasting more than three or five hours. Such a 
treatment can be an adjunct to the therapeutic use of low molecular weight (LMW) heparin 
prophylaxis.

In another embodiment, the iRNA agent that targets Factor V can be administered to
patients with Factor V Leiden to treat deep vein thrombosis (DVT) or pulmonary embolism 
(PE). Such a treatment can be an adjunct to (or can replace) therapeutic uses of heparin or 
Coumadin. The treatment can be administered by inhalation or generally by pulmonary 
routes. 

In a preferred embodiment, an iRNA agent administered to treat a liver disorder is 
targeted to the liver. For example, the iRNA agent can be complexed with a targeting 
moiety, e.g., an antibody or ligand that recognizes a liver-specific receptor. 

The invention also includes preparations, including substantially pure or 
pharmaceutically acceptable preparations of iRNA agents which silence any of the genes 
discussed herein and in particular for any of apoB-100, glucose-6-phosphatase, beta-catenin, 
factor V, or any of the HCV genes discussed herein.

The methods and compositions of the invention, e.g., the methods and compositions 
to treat diseases and disorders of the liver described herein, can be used with any of the iRNA 
agents described. In addition, the methods and compositions of the invention can be used for 
the treatment of any disease or disorder described herein, and for the treatment of any 
subject, e.g., any animal, any mammal, such as any human. 

In another aspect, the invention features a method of selecting two sequences or
strands for use in an iRNA agent. The method includes: 

providing a first candidate sequence and a second candidate sequence; 
determining the value of a parameter which is a function of the number of 
palindromic pairs between the first and second sequence, wherein a palindromic pair is a 
nucleotide on said first sequence which, when the sequences are aligned in anti-parallel 
orientation, will hybridize with a nucleotide on said second sequence; 

comparing the number with a predetermined reference value, and if the number has 
a predetermined relationship with the reference, e.g., if it is the same or greater, selecting the 
sequences for use in an iRNA agent. In most cases each of the two sequences will be 
completely complementary with a target sequence (though as described elsewhere herein that 
may not always be the case, there may not be perfect complementarity with one or both of 
the target sequences) and will have sufficient complementarity with each other to form a 
duplex. The parameter can be derived e.g., by directly determining the number of 
palindromic pairs, e.g., by inspection or by the use of a computer program which compares
or analyzes sequences. The parameter can also be determined less directly, e.g., by
calculation or measurement of the Tm or other value related to the free energy of
association or dissociation of a duplex.
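
The direct determination described above can be sketched in code. The following is a minimal illustration only, assuming equal-length candidate sequences written 5' to 3', classical Watson-Crick pairing, and a "same or greater" relationship to the reference value. The function names and any reference value passed in are hypothetical and are not part of the disclosed method.

```python
# Classical Watson-Crick partners in the RNA alphabet (an assumption of this sketch).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def _normalize(seq: str) -> str:
    """Upper-case the sequence and treat T as U so gene sequences can be compared."""
    return seq.upper().replace("T", "U")


def count_palindromic_pairs(first: str, second: str) -> int:
    """Count positions of `first` that pair with `second` in anti-parallel alignment."""
    a, b = _normalize(first), _normalize(second)
    if len(a) != len(b):
        raise ValueError("candidate sequences must have the same length")
    # Reversing `second` places its 3' end opposite the 5' end of `first`.
    return sum((x, y) in WATSON_CRICK for x, y in zip(a, reversed(b)))


def select_for_irna_agent(first: str, second: str, reference: int) -> bool:
    """Select the sequences if the pair count is the same as or greater than `reference`."""
    return count_palindromic_pairs(first, second) >= reference
```

For example, `count_palindromic_pairs("AUGC", "GCAA")` counts three pairs, so the two candidates would be selected against a hypothetical reference value of 3 but not against a reference value of 4.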

In a preferred embodiment the determination can be performed on a target sequence, 
e.g., a genomic sequence. In such embodiments the selected sequence is converted to its 
complement in the iRNA agent. 
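
Converting a selected target sequence to its complement can be illustrated as follows. This sketch assumes the target is supplied as a DNA-alphabet gene sequence read 5' to 3', and that the complementary strand is wanted as RNA, also read 5' to 3'. The function name is hypothetical.

```python
# Maps each DNA base of the target to the RNA base that pairs with it.
DNA_TO_RNA_COMPLEMENT = str.maketrans("ACGT", "UGCA")


def rna_reverse_complement(target_dna: str) -> str:
    """Return the RNA strand complementary to `target_dna`, written 5'->3'."""
    complement = target_dna.upper().translate(DNA_TO_RNA_COMPLEMENT)
    # Reverse so that the result is also read 5'->3'.
    return complement[::-1]
```

Under these assumptions, a target written `ATGC` gives the complementary RNA `GCAU`.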

In a preferred embodiment the first and second sequences are selected from the 
sequence of a single target gene. In other embodiments the first sequence is selected from 
the sequence of a first target gene and the second sequence is selected from the sequence of a
second target gene.

In a preferred embodiment the method includes comparing blocks of sequence, e.g., 
blocks which are between 15 and 25 nucleotides in length, and preferably 19, 20, or 21, and 
most preferably 19 nucleotides in length, to determine if they are suitable for use, e.g., if they 
possess sufficient palindromic pairs. 
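
A block-by-block comparison of this kind could be sketched as below. This is an illustration under stated assumptions: RNA-alphabet input, classical Watson-Crick pairing, and an exhaustive scan of every block against every other block. The default `min_pairs` threshold and the function names are hypothetical; only the 19-nucleotide block length is taken from the text above.

```python
from typing import Iterator, Tuple

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def pair_count(block_a: str, block_b: str) -> int:
    """Watson-Crick pairs between two equal-length blocks aligned anti-parallel."""
    return sum((x, y) in WATSON_CRICK for x, y in zip(block_a, reversed(block_b)))


def candidate_blocks(
    seq_a: str, seq_b: str, block_len: int = 19, min_pairs: int = 15
) -> Iterator[Tuple[int, int, int]]:
    """Yield (start_a, start_b, pairs) for each block pair meeting the threshold."""
    for i in range(len(seq_a) - block_len + 1):
        block_a = seq_a[i : i + block_len]
        for j in range(len(seq_b) - block_len + 1):
            pairs = pair_count(block_a, seq_b[j : j + block_len])
            if pairs >= min_pairs:
                yield i, j, pairs
```

Passing the same sequence as both `seq_a` and `seq_b` scans a single target gene; passing two different sequences compares blocks drawn from two target genes.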

In a preferred embodiment the first and second sequences are divided into a plurality 
of regions, e.g., terminal regions and a middle region disposed between the terminal regions 
and wherein the reference value, or the predetermined relationship to the reference value, is
different for at least two regions. E.g., the first and second sequences, when aligned in anti- 
parallel orientation, are divided into terminal regions each of a selected number of base pairs, 
e.g., 2, 3, 4, 5, or 6, and a middle region, and the reference value for the terminal regions is 
higher than for the middle regions. In other words, a higher number or proportion of 
palindromic pairs is required in the terminal regions. 
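
One way to express region-specific reference values is shown below. It is a sketch only, assuming classical Watson-Crick pairing, RNA-alphabet input, and reference values expressed as the proportion of paired positions in each region. The default thresholds and the function names are hypothetical.

```python
from typing import Tuple

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def region_pair_fractions(
    first: str, second: str, terminal_len: int = 4
) -> Tuple[float, float, float]:
    """Fraction of paired positions in the (first terminal, middle, last terminal) regions.

    Regions are counted along `first` after anti-parallel alignment with `second`.
    """
    if terminal_len < 1:
        raise ValueError("terminal_len must be at least 1")
    if len(first) != len(second) or len(first) <= 2 * terminal_len:
        raise ValueError("sequences must be equal in length and leave a middle region")
    paired = [(x, y) in WATSON_CRICK for x, y in zip(first, reversed(second))]
    regions = (
        paired[:terminal_len],
        paired[terminal_len:-terminal_len],
        paired[-terminal_len:],
    )
    return tuple(sum(region) / len(region) for region in regions)


def passes_regional_thresholds(
    first: str,
    second: str,
    terminal_len: int = 4,
    terminal_min: float = 1.0,  # hypothetical: stricter value for the terminal regions
    middle_min: float = 0.75,   # hypothetical: more permissive value for the middle
) -> bool:
    """Apply a higher reference value to the terminal regions than to the middle."""
    left, middle, right = region_pair_fractions(first, second, terminal_len)
    return left >= terminal_min and right >= terminal_min and middle >= middle_min
```

With these defaults a single unpaired position in the middle region can be tolerated, while any unpaired position in a terminal region causes the candidate pair to be rejected.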

In a preferred embodiment the first and second sequences are gene sequences; thus the
complements of the sequences will be used in an iRNA agent.

In a preferred embodiment hybridize means a classical Watson-Crick pairing. In other 
embodiments hybridize can include non-Watson-Crick pairing, e.g., pairings seen in micro
RNA precursors. 

In a preferred embodiment the method includes the addition of nucleotides to form 
overhangs, e.g., 3' or 5' overhangs, preferably one or more 3' overhangs. Overhangs are 
discussed in detail elsewhere herein but are preferably about 2 nucleotides in length. The 
overhangs can be complementary to the gene sequences being targeted or can be other
sequence. TT is a preferred overhang sequence. The first and second iRNA agent sequences
can also be joined, e.g., by additional bases to form a hairpin, or by other non-base linkers.

In a preferred embodiment the method is used to select all or part of an iRNA agent.
The selected sequences can be incorporated into an iRNA agent having any architecture, e.g., 
an architecture described herein. E.g., it can be incorporated into an iRNA agent having an 
overhang structure, overall length, hairpin vs. two-strand structure, as described herein. In 
addition, monomers other than naturally occurring ribonucleotides can be used in the selected 
iRNA agent. 

Preferred iRNA agents of this method will target genes expressed in the liver, e.g., 
one of the genes disclosed herein, e.g., apo B, beta-catenin, an HCV gene, or glucose 6



In another aspect, the invention features an iRNA agent determined, made, or
selected by a method described herein. 

The methods and compositions of the invention, e.g., the methods and iRNA 
compositions to treat liver-based diseases described herein, can be used with any dosage 
and/or formulation described herein, as well as with any route of administration described 
herein. 

The invention also provides for the use of an iRNA agent which includes monomers 
which can form other than a canonical Watson-Crick pairing with another monomer, e.g., a 
monomer on another strand. 

The use of "other than canonical Watson-Crick pairing" between monomers of a 
duplex can be used to control, often to promote, melting of all or part of a duplex. The iRNA 
agent can include a monomer at a selected or constrained position that results in a first level 
of stability in the iRNA agent duplex (e.g., between the two separate molecules of a double 
stranded iRNA agent) and a second level of stability in a duplex between a sequence of an 
iRNA agent and another sequence molecule, e.g., a target or off-target sequence in a subject. 
In some cases the second duplex has a relatively greater level of stability, e.g., in a duplex 
between an anti-sense sequence of an iRNA agent and a target mRNA. In this case one or 
more of the monomers, the position of the monomers in the iRNA agent, and the target 
sequence (sometimes referred to herein as the selection or constraint parameters), are 
selected such that the iRNA agent duplex has a comparatively lower free energy of
association (which while not wishing to be bound by mechanism or theory, is believed to 
contribute to efficacy by promoting disassociation of the duplex iRNA agent in the context of 
the RISC) while the duplex formed between an anti-sense targeting sequence and its target 
sequence, has a relatively higher free energy of association (which while not wishing to be 
bound by mechanism or theory, is believed to contribute to efficacy by promoting association 
of the anti-sense sequence and the target RNA). 

In other cases the second duplex has a relatively lower level of stability, e.g., in a 
duplex between a sense sequence of an iRNA agent and an off-target mRNA. In this case 
one or more of the monomers, the position of the monomers in the iRNA agent, and an off- 
target sequence, are selected such that the iRNA agent duplex has a comparatively higher
free energy of association while the duplex formed between a sense targeting sequence and 
its off-target sequence, has a relatively lower free energy of association (which while not 
wishing to be bound by mechanism or theory, is believed to reduce the level of off-target 
silencing by promoting disassociation of the duplex formed by the
sense strand and the off-target sequence). 

Thus, inherent in the structure of the iRNA agent is the property of having a first 
stability for the intra-iRNA agent duplex and a second stability for a duplex formed between 
a sequence from the iRNA agent and another RNA, e.g., a target mRNA. As discussed 
above, this can be accomplished by judicious selection of one or more of the monomers at a 
selected or constrained position, the selection of the position in the duplex to place the 
selected or constrained position, and selection of the sequence of a target sequence (e.g., the 
particular region of a target gene which is to be targeted). The iRNA agent sequences which 
satisfy these requirements are sometimes referred to herein as constrained sequences. Exercise
of the constraint or selection parameters can be, e.g., by inspection, or by computer assisted 
methods. Exercise of the parameters can result in selection of a target sequence and of 
particular monomers to give a desired result in terms of the stability, or relative stability, of a 
duplex. 

Thus, in one aspect, the invention features an iRNA agent which includes: a first
sequence which targets a first target region and a second sequence which targets a second 
target region. The first and second sequences have sufficient complementarity to each other 
to hybridize, e.g., under physiological conditions, e.g., under physiological conditions but not
in contact with a helicase or other unwinding enzyme. In a duplex region of the iRNA agent, 
at a selected or constrained position, the first target region has a first monomer, and the 
second target region has a second monomer. The first and second monomers occupy 
complementary or corresponding positions. One, and preferably both, monomers are selected
such that the stability which the pairing of the monomers contributes to a duplex between the first
and second sequences will differ from the stability of the pairing between the first or second
sequence and a target sequence.

Usually, the monomers will be selected (selection of the target sequence may be 
required as well) such that they form a pairing in the iRNA agent duplex which has a lower 
free energy of dissociation, and a lower Tm, than will be possessed by the pairing of the
monomer with its complementary monomer in a duplex between the iRNA agent sequence
and a target RNA.

The constraint placed upon the monomers can be applied at a selected site or at more 
than one selected site. By way of example, the constraint can be applied at more than 1, but 
less than 3, 4, 5, 6, or 7 sites in an iRNA agent duplex. 

A constrained or selected site can be present at a number of positions in the iRNA 
agent duplex. E.g., a constrained or selected site can be present within 3, 4, 5, or 6 positions 
from either end, 3' or 5' of a duplexed sequence. A constrained or selected site can be 
present in the middle of the duplex region, e.g., it can be more than 3, 4, 5, or 6, positions 
from the end of a duplexed region. 
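
The distinction between a site near an end of the duplex and a site in the middle can be made concrete with a small helper. This sketch assumes 1-based positions counted along the duplexed region and reads "within N positions from either end" as "no more than N positions from the nearer end". That reading, the default window, and the function name are assumptions.

```python
def site_location(position: int, duplex_len: int, window: int = 4) -> str:
    """Classify a 1-based duplex position as 'terminal' or 'middle'.

    A site is 'terminal' if it lies within `window` positions of either end
    of the duplexed region, and 'middle' otherwise.
    """
    if not 1 <= position <= duplex_len:
        raise ValueError("position must lie within the duplex")
    # Distance is counted inclusively, so the end position itself has distance 1.
    distance_from_nearest_end = min(position, duplex_len - position + 1)
    return "terminal" if distance_from_nearest_end <= window else "middle"
```

For a 19-pair duplex and a window of 4, positions 1 to 4 and 16 to 19 are classified as terminal and positions 5 to 15 as middle.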

The iRNA agent can be selected to target a broad spectrum of genes, including any of 
the genes described herein. 

In a preferred embodiment the iRNA agent has an architecture (architecture refers to 
one or more of overall length, length of a duplex region, the presence, number, location, or 
length of overhangs, single strand versus double strand form) described herein.

E.g., the iRNA agent can be less than 30 nucleotides in length, e.g., 21-23 
nucleotides. Preferably, the iRNA is 21 nucleotides in length and there is a duplex region of 
about 19 pairs. In one embodiment, the iRNA is 21 nucleotides in length, and the duplex 
region of the iRNA is 19 nucleotides. In another embodiment, the iRNA is greater than 30 
nucleotides in length. 
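
The 21-nucleotide architecture with a 19-pair duplex region can be illustrated as follows. This is a sketch under stated assumptions: a 19-nucleotide RNA-alphabet target region, a sense strand identical to that region, an anti-sense strand fully complementary to it, and a 2-nucleotide TT overhang at the 3' end of each strand. It does not model a selected or constrained site, and the function name is hypothetical.

```python
from typing import Tuple

RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def build_duplex(target_rna: str, overhang: str = "TT") -> Tuple[str, str]:
    """Return (sense, antisense) strands, each written 5'->3'.

    Each strand is the 19-nucleotide duplex portion followed by a 3' overhang,
    giving 21 nucleotides per strand with the default overhang.
    """
    core = target_rna.upper()
    if len(core) != 19:
        raise ValueError("this sketch expects a 19-nucleotide target region")
    # The anti-sense core is the reverse complement of the target region.
    antisense_core = core.translate(RNA_COMPLEMENT)[::-1]
    return core + overhang, antisense_core + overhang
```

Calling `build_duplex("GCAUGCAUGCAUGCAUGCA")` returns two 21-nucleotide strands whose first 19 nucleotides form the duplex region and whose last two nucleotides are the TT overhangs.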

In some embodiments the duplex region of the iRNA agent will have mismatches in
addition to the selected or constrained site or sites. Preferably it will have no more than 1, 2,
3, 4, or 5 bases which do not form canonical Watson-Crick pairs or which do not hybridize.
Overhangs are discussed in detail elsewhere herein but are preferably about 2 nucleotides in
length. The overhangs can be complementary to the gene sequences being targeted or can be
other sequence. TT is a preferred overhang sequence. The first and second iRNA agent
sequences can also be joined, e.g., by additional bases to form a hairpin, or by other non-base 
linkers. 

The monomers can be selected such that: first and second monomers are naturally 
occurring ribonucleotides, or modified ribonucleotides having naturally occurring bases, and 
when occupying complementary sites either do not pair and have no substantial level of H- 
bonding, or form a non canonical Watson-Crick pairing and form a non-canonical pattern of 
H bonding, which usually have a lower free energy of dissociation than seen in a canonical 
Watson-Crick pairing, or otherwise pair to give a free energy of association which is less 
than that of a preselected value or is less, e.g., than that of a canonical pairing. When one (or 
both) of the iRNA agent sequences duplexes with a target, the first (or second) monomer 
forms a canonical Watson-Crick pairing with the base in the complementary position on the 
target, or forms a non canonical Watson-Crick pairing having a higher free energy of 
dissociation and a higher Tm than seen in the pairing in the iRNA agent. The classical
Watson-Crick pairings are as follows: A-T, G-C, and A-U. Non-canonical Watson-Crick
pairings are known in the art and can include U-U, G-G, G-Atrans, G-Acis, and G-U.
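
The pairings listed above can be captured in a simple lookup. This sketch only encodes the pairs named in the text; it does not distinguish the cis and trans forms of G-A, it makes no claim about relative free energies, and the label "other" for unlisted combinations is an assumption of the sketch.

```python
# Unordered pairs, so that A-U and U-A are treated identically.
CANONICAL = {frozenset("AT"), frozenset("GC"), frozenset("AU")}
NON_CANONICAL = {
    frozenset("U"),   # U-U
    frozenset("G"),   # G-G
    frozenset("GA"),  # G-A (cis and trans are not distinguished here)
    frozenset("GU"),  # G-U
}


def classify_pair(x: str, y: str) -> str:
    """Return 'canonical', 'non-canonical', or 'other' for two opposed bases."""
    pair = frozenset((x.upper(), y.upper()))
    if pair in CANONICAL:
        return "canonical"
    if pair in NON_CANONICAL:
        return "non-canonical"
    return "other"
```

For example, `classify_pair("G", "C")` is classified as canonical and `classify_pair("G", "U")` as non-canonical.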

The monomer in one or both of the sequences is selected such that it does not pair, or
forms a pair with its corresponding monomer in the other sequence which minimizes stability
(e.g., the H bonding formed between the monomer at the selected site in the one sequence
and its monomer at the corresponding site in the other sequence is less stable than the H
bonds formed by the monomer of one (or both) of the sequences with the respective target
sequence). The monomer in one or both strands is also chosen to promote stability in one or
both of the duplexes made by a strand and its target sequence. E.g., one or more of the
monomers and the target sequences are selected such that at the selected or constrained
position, there are no H bonds formed, or a non-canonical pairing is formed in the iRNA
agent duplex, or they otherwise pair to give a free energy of association which is



WO 2004/080406 



PCT/US2004/007070 



less than that of a preselected value or is less, e.g., than that of a canonical pairing, but when 
one (or both) sequences form a duplex with the respective target, the pairing at the selected
or constrained site is a canonical Watson-Crick pairing. 

The inclusion of such monomers will have one or more of the following effects: it
will destabilize the iRNA agent duplex, it will destabilize interactions between the sense
sequence and unintended target sequences, sometimes referred to as off-target sequences, and
duplex interactions between a sequence and the intended target will not be destabilized.

By way of example: 

the monomer at the selected site in the first sequence includes an A (or a modified 
base which pairs with T), and the monomer at the selected position in the second sequence
is chosen from a monomer which will not pair or which will form a non-canonical pairing, 
e.g., G. These will be useful in applications wherein the target sequence for the first 
sequence has a T at the selected position. In embodiments where both target duplexes are 
stabilized it is useful wherein the target sequence for the second strand has a monomer which 
will form a canonical Watson-Crick pairing with the monomer selected for the selected 
position in the second strand. 

the monomer at the selected site in the first sequence includes U (or a modified base 
which pairs with A), and the monomer at the selected position in the second sequence is
chosen from a monomer which will not pair or which will form a non-canonical pairing, e.g.,
U or G. These will be useful in applications wherein the target sequence for the first
sequence has an A at the selected position. In embodiments where both target duplexes are
stabilized it is useful wherein the target sequence for the second strand has a monomer which 
will form a canonical Watson-Crick pairing with the monomer selected for the selected 
position in the second strand. 

The monomer at the selected site in the first sequence includes a G (or a modified 
base which pairs with C), and the monomer at the selected position in the second sequence
is chosen from a monomer which will not pair or which will form a non-canonical pairing,
e.g., G, Acis, Atrans, or U. These will be useful in applications wherein the target sequence
for the first sequence has a C at the selected position. In embodiments where both target
duplexes are stabilized it is useful wherein the target sequence for the second strand has a
monomer which will form a canonical Watson-Crick pairing with the monomer selected for
the selected position in the second strand. 

The monomer at the selected site in the first sequence includes a C (or a modified 
base which pairs with G), and the monomer at the selected position in the second sequence
is chosen from a monomer which will not pair or which will form a non-canonical pairing. These
will be useful in applications wherein the target sequence for the first sequence has a G at the
selected position. In embodiments where both target duplexes are stabilized it is useful 
wherein the target sequence for the second strand has a monomer which will form a 
canonical Watson-Crick pairing with the monomer selected for the selected position in the 
second strand. 
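
The logic common to these examples can be sketched as a check at a single selected position. This is an illustration only, assuming naturally occurring bases and treating any pairing other than A-T, G-C, or A-U as destabilizing in the iRNA agent duplex. Modified bases are not modeled, and the function names are hypothetical.

```python
CANONICAL = {frozenset("AT"), frozenset("GC"), frozenset("AU")}


def is_canonical(x: str, y: str) -> bool:
    """True if the two bases form a classical Watson-Crick pair."""
    return frozenset((x.upper(), y.upper())) in CANONICAL


def satisfies_constraint(
    first_monomer: str, second_monomer: str, first_target_monomer: str
) -> bool:
    """Check one selected position.

    The monomers in the first and second sequences must not pair canonically
    with each other, while the first-sequence monomer must pair canonically
    with the monomer at the corresponding position of its target.
    """
    return (not is_canonical(first_monomer, second_monomer)) and is_canonical(
        first_monomer, first_target_monomer
    )
```

Taking the first example above, an A in the first sequence opposed by a G in the second sequence, with a T at the selected position of the target, satisfies the check, whereas an A opposed by a U does not.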

In another embodiment a non-naturally occurring or modified monomer or monomers 
are chosen such that when the non-naturally occurring or modified monomers occupy the
selected or constrained position in an iRNA agent they exhibit a first free
energy of dissociation and when one (or both) of them pairs with a naturally occurring 
monomer, the pair exhibits a second free energy of dissociation, which is usually higher than 
that of the pairing of the first and second monomers. E.g., when the first and second 
monomers occupy complementary positions they either do not pair and have no substantial 
level of H-bonding, or form a weaker bond than one of them would form with a naturally 
occurring monomer, and reduce the stability of that duplex, but when the duplex dissociates 
at least one of the strands will form a duplex with a target in which the selected monomer 
will promote stability, e.g., the monomer will form a more stable pair with a naturally 
occurring monomer in the target sequence than the pairing it formed in the iRNA agent. 

An example of such a pairing is 2-amino A and either of a 2-thio pyrimidine analog
of U or T.

When placed in complementary positions of the iRNA agent these monomers will 
pair very poorly and will minimize stability. However, a duplex formed between 2-amino
A and the U of a naturally occurring target, or between 2-thio U and the A of a
naturally occurring target, or between 2-thio T and the A of a naturally occurring target, will have a
relatively higher free energy of dissociation and be more stable. This is shown in FIG. 1.

The pair shown in FIG. 1 (the 2-amino A and the 2-s U and T) is exemplary. In 
another embodiment, the monomer at the selected position in the sense strand can be a 
universal pairing moiety. A universal pairing agent will form some level of H bonding with
more than one and preferably all other naturally occurring monomers. An example of a 
universal pairing moiety is a monomer which includes 3-nitro pyrrole. (Examples of other 
candidate universal base analogs can be found in the art, e.g., in Loakes, 2001, NAR 29: 
2437-2447, hereby incorporated by reference. Examples can also be found in the section on 
Universal Bases below.) In these cases the monomer at the corresponding position of the 
anti-sense strand can be chosen for its ability to form a duplex with the target and can 
include, e.g., A, U, G, or C. 

In another aspect, the invention features an iRNA agent which includes: a sense
sequence, which preferably does not target a sequence in a subject, and an anti-sense
sequence, which targets a target gene in a subject. The sense and anti-sense sequences have
sufficient complementarity to each other to hybridize, e.g., under physiological
conditions, e.g., under physiological conditions but not in contact with a helicase or other
unwinding enzyme. In a duplex region of the iRNA agent, at a selected or constrained
position, the monomers are selected such that:

the monomer in the sense sequence is selected such that, it does not pair, or forms a 
pair with its corresponding monomer in the anti-sense strand which minimizes stability (e.g., 
the H bonding formed between the monomer at the selected site in the sense strand and its 
monomer at the corresponding site in the anti-sense strand are less stable than the H bonds 
formed by the monomer of the anti-sense sequence and its canonical Watson-Crick partner 
or, if the monomer in the anti-sense strand includes a modified base, the natural analog of the 
modified base and its canonical Watson-Crick partner); 

the monomer in the corresponding position in the anti-sense strand is selected such
that it maximizes the stability of a duplex it forms with the target sequence, e.g., it forms a
canonical Watson-Crick pairing with the monomer in the corresponding position on the target
strand;

optionally, the monomer in the sense sequence is selected such that, it does not pair, 
or forms a pair with its corresponding monomer in the anti-sense strand which minimizes 
stability with an off-target sequence. 

The inclusion of such monomers will have one or more of the following effects: it
will destabilize the iRNA agent duplex, it will destabilize interactions between the sense
sequence and unintended target sequences, sometimes referred to as off-target sequences, and
duplex interactions between the anti-sense strand and the intended target will not be 
destabilized. 

The constraint placed upon the monomers can be applied at a selected site or at more 
than one selected site. By way of example, the constraint can be applied at more than 1, but 
less than 3, 4, 5, 6, or 7 sites in an iRNA agent duplex. 

A constrained or selected site can be present at a number of positions in the iRNA 
agent duplex. E.g., a constrained or selected site can be present within 3, 4, 5, or 6 positions 
from either end, 3' or 5', of a duplexed sequence. A constrained or selected site can be 
present in the middle of the duplex region, e.g., it can be more than 3, 4, 5, or 6 positions 
from the end of a duplexed region. 
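
The positional constraint described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: positions are counted from 1 along the duplexed region, and the function name classify_site and its window parameter are hypothetical and do not appear in this disclosure.

```python
def classify_site(position: int, duplex_length: int, window: int = 3) -> str:
    """Classify a selected site as near an end or in the middle of a duplex.

    Assumptions (illustrative only): positions are 1-based along the
    duplexed region, and a site counts as "near_end" when it lies within
    `window` positions of either end.
    """
    if not 1 <= position <= duplex_length:
        raise ValueError("position must lie inside the duplexed region")
    # Distance to the nearer end, counting the terminal position as 1.
    distance_to_end = min(position, duplex_length - position + 1)
    return "near_end" if distance_to_end <= window else "middle"
```

With a 19-pair duplex and a window of 3, positions 1-3 and 17-19 would be reported as near an end and positions 4-16 as the middle.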

The iRNA agent can be selected to target a broad spectrum of genes, including any of 
the genes described herein. 

In a preferred embodiment the iRNA agent has an architecture (architecture refers to 
one or more of overall length, length of a duplex region, the presence, number, location, or 
length of overhangs, single strand versus double strand form) described herein. 

E.g., the iRNA agent can be less than 30 nucleotides in length, e.g., 21-23 
nucleotides. Preferably, the iRNA is 21 nucleotides in length and there is a duplex region of 
about 19 pairs. In one embodiment, the iRNA is 21 nucleotides in length, and the duplex 
region of the iRNA is 19 nucleotides. In another embodiment, the iRNA is greater than 30 
nucleotides in length. 

In some embodiments the duplex region of the iRNA agent will have mismatches in 
addition to the selected or constrained site or sites. Preferably it will have no more than 1, 2, 
3, 4, or 5 bases which do not form canonical Watson-Crick pairs or which do not hybridize. 
Overhangs are discussed in detail elsewhere herein but are preferably about 2 nucleotides in 
length. The overhangs can be complementary to the gene sequences being targeted or can be 
another sequence. TT is a preferred overhang sequence. The first and second iRNA agent 
sequences can also be joined, e.g., by additional bases to form a hairpin, or by other non-base 
linkers. 
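
The architecture parameters discussed above (overall length, duplex length, overhang length, and number of mismatches) can be collected into a small record. The following Python sketch is illustrative only: the class name Architecture, the function within_limits, and the assumption that all unpaired nucleotides of a strand sit in a single overhang are not taken from this disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Architecture:
    strand_length: int   # nucleotides in one strand
    duplex_length: int   # paired positions in the duplex region
    mismatches: int = 0  # positions that do not form canonical pairs

    @property
    def overhang_length(self) -> int:
        # Assumption: every unpaired nucleotide sits in one overhang.
        return self.strand_length - self.duplex_length


def within_limits(arch: Architecture, max_mismatches: int = 5) -> bool:
    """Check the 'less than 30 nucleotides' and mismatch limits."""
    return (
        arch.duplex_length <= arch.strand_length < 30
        and arch.mismatches <= max_mismatches
    )
```

For the 21-nucleotide strand with a 19-pair duplex region mentioned above, Architecture(21, 19).overhang_length evaluates to 2, which agrees with the overhang length of about 2 nucleotides.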

One or more selection or constraint parameters can be exercised such that: monomers 
at the selected site in the sense and anti-sense sequences are both naturally occurring 
ribonucleotides, or modified ribonucleotides having naturally occurring bases, and when 
occupying complementary sites in the iRNA agent duplex either do not pair and have no 
substantial level of H-bonding, or form a non-canonical Watson-Crick pairing and thus form 
a non-canonical pattern of H bonding, which generally has a lower free energy of 
dissociation than seen in a Watson-Crick pairing, or otherwise pair to give a free energy of 
association which is less than that of a preselected value or is less, e.g., than that of a 
canonical pairing. When one of the iRNA agent sequences, usually the anti-sense sequence, 
forms a duplex with another sequence, generally a sequence in the subject, and generally a 
target sequence, the monomer forms a classic Watson-Crick pairing with the base in the 
complementary position on the target, or forms a non-canonical Watson-Crick pairing having 
a higher free energy of dissociation and a higher Tm than seen in the pairing in the iRNA 
agent. Optionally, when the other sequence of the iRNA agent, usually the sense sequence, 
forms a duplex with another sequence, generally a sequence in the subject, and generally an 
off-target sequence, the monomer fails to form a canonical Watson-Crick pairing with the 
base in the complementary position on the off-target sequence, e.g., it does not pair or forms a 
non-canonical Watson-Crick pairing having a lower free energy of dissociation and a lower Tm. 
By way of example: 

the monomer at the selected site in the anti-sense strand includes an A (or a modified 
base which pairs with T), the corresponding monomer in the target is a T, and the sense 
strand is chosen from a base which will not pair or which will form a noncanonical pair, e.g., 
G; 

the monomer at the selected site in the anti-sense strand includes a U (or a modified 
base which pairs with A), the corresponding monomer in the target is an A, and the sense 
strand is chosen from a monomer which will not pair or which will form a non-canonical 
pairing, e.g., U or G; 

the monomer at the selected site in the anti-sense strand includes a C (or a modified 
base which pairs with G), the corresponding monomer in the target is a G, and the sense 
strand is chosen from a monomer which will not pair or which will form a non-canonical pairing, 
e.g., G, Acis, Atrans, or U; or 

the monomer at the selected site in the anti-sense strand includes a G (or a modified 
base which pairs with C), the corresponding monomer in the target is a C, and the sense 



WO 2004/080406 PCT/US2004/007070 

strand is chosen from a monomer which will not pair or which will form a non-canonical 
pairing. 
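
The four examples above amount to a lookup from the anti-sense monomer to its target partner and to candidate sense-strand monomers. The Python sketch below simply transcribes those examples. The names TARGET_PARTNER, SENSE_CANDIDATES, and sense_candidates are hypothetical, and the empty entry for G reflects only that no example monomers are listed above for that case.

```python
# Target monomer stated for each anti-sense monomer in the examples above.
TARGET_PARTNER = {"A": "T", "U": "A", "C": "G", "G": "C"}

# Example sense-strand monomers listed above for each anti-sense monomer.
# "Acis" and "Atrans" are kept as labels exactly as they appear in the text.
SENSE_CANDIDATES = {
    "A": ("G",),
    "U": ("U", "G"),
    "C": ("G", "Acis", "Atrans", "U"),
    "G": (),  # no example monomers are listed for this case
}


def sense_candidates(antisense_base: str) -> tuple:
    """Return the example sense-strand monomers for an anti-sense base."""
    return SENSE_CANDIDATES[antisense_base.upper()]
```

A caller could, for instance, look up sense_candidates("U") to obtain the two listed options, U and G, for a site where the anti-sense strand carries a U opposite an A in the target.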

In another embodiment a non-naturally occurring or modified monomer or monomers 
are chosen such that when they occupy complementary positions in an iRNA agent they exhibit 
a first free energy of dissociation and when one (or both) of them pairs with a naturally 
occurring monomer, the pair exhibits a second free energy of dissociation, which is usually 
higher than that of the pairing of the first and second monomers. E.g., when the first and 
second monomers occupy complementary positions they either do not pair and have no 
substantial level of H-bonding, or form a weaker bond than one of them would form with a 
naturally occurring monomer, and reduce the stability of that duplex, but when the duplex 
dissociates at least one of the strands will form a duplex with a target in which the selected 
monomer will promote stability, e.g., the monomer will form a more stable pair with a 
naturally occurring monomer in the target sequence than the pairing it formed in the iRNA 
agent. 

An example of such a pairing is 2-amino A and either of a 2-thio pyrimidine analog 
of U or T. As is discussed above, when placed in complementary positions of the iRNA 
agent these monomers will pair very poorly and will minimize stability. However, a duplex 
formed between 2-amino A and the U of a naturally occurring target, or a duplex formed 
between 2-thio U and the A of a naturally occurring target or 2-thio T and the A of a 
naturally occurring target, will have a relatively higher free energy of dissociation and be 
more stable. 

The monomer at the selected position in the sense strand can be a universal pairing 
moiety. A universal pairing agent will form some level of H bonding with more than one and 
preferably all other naturally occurring monomers. An example of a universal pairing 
moiety is a monomer which includes 3-nitro pyrrole. Examples of other candidate universal 
base analogs can be found in the art, e.g., in Loakes, 2001, NAR 29: 2437-2447, hereby 
incorporated by reference. In these cases the monomer at the corresponding position of the 
anti-sense strand can be chosen for its ability to form a duplex with the target and can 
include, e.g., A, U, G, or C. 

In another aspect, the invention features an iRNA agent which includes: 
a sense sequence, which preferably does not target a sequence in a subject, and an anti-sense 
sequence, which targets a plurality of target sequences in a subject, wherein the targets differ 
in sequence at only 1 or a small number, e.g., no more than 5, 4, 3 or 2 positions. The sense 
and anti-sense sequences have sufficient complementarity to each other to hybridize, e.g., 
under physiological conditions, e.g., under physiological conditions but not in contact with a 
helicase or other unwinding enzyme. The sequence of the anti-sense strand of the iRNA 
agent is selected such that at one, some, or all of the positions which correspond to positions 
that differ in sequence between the target sequences, the anti-sense strand will include a 
monomer which will form H-bonds with at least two different target sequences. In a 
preferred example the anti-sense sequence will include a universal or promiscuous monomer, 
e.g., a monomer which includes 5-nitro pyrrole, 2-amino A, 2-thio U or 2-thio T, or other 
universal base referred to herein. 
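
The selection of an anti-sense sequence against several closely related targets can be sketched as follows. This Python example is a hedged illustration, not a method taken from this disclosure: the function names, the use of the letter "N" to stand for a universal or promiscuous monomer, and the two short target sequences used in the usage note are all assumptions made for the example.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def variable_positions(targets):
    """Return 0-based positions at which equal-length targets differ."""
    length = len(targets[0])
    if any(len(t) != length for t in targets):
        raise ValueError("targets must have equal length")
    return [i for i in range(length) if len({t[i] for t in targets}) > 1]


def antisense_with_universal(targets, universal="N"):
    """Build a 5'->3' anti-sense sequence for 5'->3' RNA targets.

    Positions shared by all targets receive the Watson-Crick complement;
    positions that differ between targets receive the placeholder for a
    universal monomer (assumed symbol "N").
    """
    variable = set(variable_positions(targets))
    paired = [
        universal if i in variable else COMPLEMENT[targets[0][i]]
        for i in range(len(targets[0]))
    ]
    # Reverse so the result reads 5'->3' on the anti-sense strand.
    return "".join(reversed(paired))
```

For the two hypothetical targets AUGGCUACG and AUGGAUACG, which differ at one position, the sketch yields CGUANCCAU, with the placeholder at the position facing the difference.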

In a preferred embodiment the iRNA agent targets repeated sequences (which differ 
at only one or a small number of positions from each other) in a single gene, a plurality of 
genes, or a viral genome, e.g., the HCV genome. 

An embodiment is illustrated in FIGs. 2 and 3. 

In another aspect, the invention features determining, e.g., by measurement or 
calculation, the stability of a pairing between monomers at a selected or constrained position 
in the iRNA agent duplex, and preferably determining the stability for the corresponding 
pairing in a duplex between a sequence from the iRNA agent and another RNA, e.g., a target 
sequence. The determinations can be compared. An iRNA agent thus analyzed can be used 
in the development of a further modified iRNA agent or can be administered to a subject. 
This analysis can be performed successively to refine or design optimized iRNA agents. 

In another aspect, the invention features a kit which includes one or more of the 
following: an iRNA agent described herein, a sterile container in which the iRNA agent is 
enclosed, and instructions for use. 

In another aspect, the invention features an iRNA agent containing a constrained 
sequence made by a method described herein. The iRNA agent can target one or more of the 
genes referred to herein. 

iRNA agents having constrained or selected sites, e.g., as described herein, can be 
used in any way described herein. Accordingly, iRNA agents having constrained or 
selected sites, e.g., as described herein, can be used to silence a target, e.g., in any of the 
methods described herein and to target any of the genes described herein or to treat any of the 
disorders described herein. iRNA agents having constrained or selected sites, e.g., as 
described herein, can be incorporated into any of the formulations or preparations, e.g., 
pharmaceutical or sterile preparations described herein. iRNA agents having constrained or 
selected sites, e.g., as described herein, can be administered by any of the routes of 
administration described herein. 

The term "other than canonical Watson-Crick pairing" as used herein, refers to a 
pairing between a first monomer in a first sequence and a second monomer at the 
corresponding position in a second sequence of a duplex in which one or more of the 
following is true: (1) there is essentially no pairing between the two, e.g., there is no 
significant level of H bonding between the monomers or binding between the monomers 
does not contribute in any significant way to the stability of the duplex; (2) the monomers are 
a non-canonical pairing of monomers having naturally occurring bases, i.e., they are other 
than A-T, A-U, or G-C, and they form monomer-monomer H bonds, although generally the 
H bonding pattern formed is less strong than the bonds formed by a canonical pairing; or (3) 
at least one of the monomers includes a non-naturally occurring base and the H bonding 
formed between the monomers is preferably less strong than the bonds formed by 
a canonical pairing, namely one or more of A-T, A-U, G-C. 
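
As a minimal sketch of the definition above, the following Python function tests whether two naturally occurring bases form one of the canonical pairings A-T, A-U, or G-C. The name is_canonical is hypothetical, and the check deliberately ignores H-bond strength and non-naturally occurring bases, so it cannot by itself distinguish among cases (1) to (3).

```python
# The canonical Watson-Crick pairings named in the definition above.
CANONICAL_PAIRS = {frozenset("AT"), frozenset("AU"), frozenset("GC")}


def is_canonical(first: str, second: str) -> bool:
    """Return True when the two bases form A-T, A-U, or G-C."""
    return frozenset((first.upper(), second.upper())) in CANONICAL_PAIRS
```

Under this sketch a G opposite a U, or an A opposite an A, is reported as other than canonical, while the order in which the two bases are supplied does not matter.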

The term "off-target" as used herein, refers to a sequence other than the sequence to 
be silenced. 

Universal Bases: "wild-cards"; shape-based complementarity 

(Bi-stranded, multisite replication of a base pair between difluorotoluene and adenine: confirmation by 
'inverse' sequencing. Liu, D.; Moran, S.; Kool, E. T. Chem. Biol., 1997, 4, 919-926) 

(Importance of terminal base pair hydrogen-bonding in 3'-end proofreading by the Klenow fragment 
of DNA polymerase I. Morales, J. C.; Kool, E. T. Biochemistry, 2000, 39, 2626-2632) 

(Selective and stable DNA base pairing without hydrogen bonds. Matray, T. J.; Kool, E. T. J. Am. 
Chem. Soc., 1998, 120, 6191-6192) 

(Difluorotoluene, a nonpolar isostere for thymine, codes specifically and efficiently for adenine in 
DNA replication. Moran, S.; Ren, R. X.-F.; Rumney IV, S.; Kool, E. T. J. Am. Chem. Soc., 1997, 119, 
2056-2057) 

(Structure and base pairing properties of a replicable nonpolar isostere for deoxyadenosine. Guckian, 
K. M.; Morales, J. C.; Kool, E. T. J. Org. Chem., 1998, 63, 9652-9656) 

(Universal bases for hybridization, replication and chain termination. Berger, M.; Wu, Y.; Ogawa, A. 
K.; McMinn, D. L.; Schultz, P. G.; Romesberg, F. E. Nucleic Acids Res., 2000, 28, 2911-2914) 




(1. Efforts toward the expansion of the genetic alphabet: Information storage and replication with unnatural 
hydrophobic base pairs. Ogawa, A. K.; Wu, Y.; McMinn, D. L.; Liu, J.; Schultz, P. G.; Romesberg, F. E. J. 
Am. Chem. Soc., 2000, 122, 3274-3287. 2. Rational design of an unnatural base pair with increased kinetic 
selectivity. Ogawa, A. K.; Wu, Y.; Berger, M.; Schultz, P. G.; Romesberg, F. E. J. Am. Chem. Soc., 2000, 
122, 8803-8804) 

(Efforts toward expansion of the genetic alphabet: replication of DNA with three base pairs. Tae, E. L.; 
Wu, Y.; Xia, G.; Schultz, P. G.; Romesberg, F. E. J. Am. Chem. Soc., 2001, 123, 7439-7440) 

(1. Efforts toward expansion of the genetic alphabet: Optimization of interbase hydrophobic 
interactions. Wu, Y.; Ogawa, A. K.; Berger, M.; McMinn, D. L.; Schultz, P. G.; Romesberg, F. E. J. Am. Chem. 
Soc., 2000, 122, 7621-7632. 2. Efforts toward expansion of genetic alphabet: DNA polymerase recognition of a 
highly stable, self-pairing hydrophobic base. McMinn, D. L.; Ogawa, A. K.; Wu, Y.; Liu, J.; Schultz, P. G.; 
Romesberg, F. E. J. Am. Chem. Soc., 1999, 121, 11585-11586) 

(A stable DNA duplex containing a non-hydrogen-bonding and non-shape complementary base 
couple: Interstrand stacking as the stability determining factor. Brotschi, C.; Haberli, A.; Leumann, C. J. Angew. 
Chem. Int. Ed., 2001, 40, 3012-3014) 

(2,2'-Bipyridine Ligandoside: A novel building block for modifying DNA with intra-duplex metal 
complexes. Weizman, H.; Tor, Y. J. Am. Chem. Soc., 2001, 123, 3375-3376) 




(Minor groove hydration is critical to the stability of DNA duplexes. Lan, T.; McLaughlin, L. W. J. 
Am. Chem. Soc., 2000, 122, 6512-13) 

(1. Effect of the Universal base 3-nitropyrrole on the selectivity of neighboring natural bases. Oliver, 
S.; Parker, K. A.; Suggs, J. W. Organic Lett., 2001, 3, 1977-1980. 2. Effect of the 
1-(2'-deoxy-β-D-ribofuranosyl)-3-nitropyrrole residue on the stability of DNA duplexes and triplexes. 
Amosova, O.; George, J.; Fresco, J. R. Nucleic Acids Res., 1997, 25, 1930-1934. 3. Synthesis, structure and 
deoxyribonucleic acid sequencing with a universal nucleoside: 1-(2'-deoxy-β-D-ribofuranosyl)-3-nitropyrrole. 
Bergstrom, D. E.; Zhang, P.; Toma, P. H.; Andrews, P. C.; Nichols, R. J. Am. Chem. Soc., 1995, 117, 1201-1209) 

(Model studies directed toward a general triplex DNA recognition scheme: a novel DNA base that 
binds a CG base-pair in an organic solvent. Zimmerman, S. C.; Schmitt, P. J. Am. Chem. Soc., 1995, 117, 
10769-10770) 




(A universal, photocleavable DNA base: nitropiperonyl 2'-deoxyriboside. J. Org. Chem., 2001, 66, 
2067-2071) 

(a. Recognition of a single guanine bulge by 2-acylamino-1,8-naphthyridine. Nakatani, K.; Sando, S.; 
Saito, I. J. Am. Chem. Soc., 2000, 122, 2172-2177. b. Specific binding of 2-amino-1,8-naphthyridine into single 
guanine bulge as evidenced by photooxidation of GC doublet. Nakatani, K.; Sando, S.; Yoshida, K.; Saito, I. 
Bioorg. Med. Chem. Lett., 2001, 11, 335-337) 

Other universal bases can have the following formulas: 

R44 is hydrogen, halo, hydroxy, nitro, protected hydroxy, NH2, NHRb, or NRbRc, 
C1-C6 alkyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 heterocyclyl, or when taken together with R45 
forms -OCH2O-; 

R45 is hydrogen, halo, hydroxy, nitro, protected hydroxy, NH2, NHRb, or NRbRc, 
C1-C6 alkyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 heterocyclyl, or when taken together with R44 
or R46 forms -OCH2O-; 

R46 is hydrogen, halo, hydroxy, nitro, protected hydroxy, NH2, NHRb, or NRbRc, 
C1-C6 alkyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 heterocyclyl, or when taken together with R45 
or R47 forms -OCH2O-; 

R47 is hydrogen, halo, hydroxy, nitro, protected hydroxy, NH2, NHRb, or NRbRc, 
C1-C6 alkyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 heterocyclyl, or when taken together with R46 
or R48 forms -OCH2O-; 

R48 is hydrogen, halo, hydroxy, nitro, protected hydroxy, NH2, NHRb, or NRbRc, 
C1-C6 alkyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 heterocyclyl, or when taken together with R47 
forms -OCH2O-; 

R49, R50, R51, R52, R53, R54, R57, R58, R59, R60, R61, R62, R63, R64, R65, R66, R67, R68, R69, 
R70, R71, and R72 are each independently selected from hydrogen, halo, hydroxy, nitro, 
protected hydroxy, NH2, NHRb, or NRbRc, C1-C6 alkyl, C2-C6 alkynyl, C6-C10 aryl, C6-C10 
heteroaryl, C3-C8 heterocyclyl, NC(O)R17, or NC(O)Ro; 

R55 is hydrogen, halo, hydroxy, nitro, protected hydroxy, NH2, NHRb, or NRbRc, 
C1-C6 alkyl, C2-C6 alkynyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 heterocyclyl, NC(O)R17, or 
NC(O)Ro, or when taken together with R56 forms a fused aromatic ring which may be 
optionally substituted; 

R56 is hydrogen, halo, hydroxy, nitro, protected hydroxy, NH2, NHRb, or NRbRc, 
C1-C6 alkyl, C2-C6 alkynyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 heterocyclyl, NC(O)R17, or 
NC(O)Ro, or when taken together with R55 forms a fused aromatic ring which may be 
optionally substituted; 

R17 is halo, NH2, NHRb, or NRbRc; 

Rb is C1-C6 alkyl or a nitrogen protecting group; 

Rc is C1-C6 alkyl; and 

Ro is alkyl optionally substituted with halo, hydroxy, nitro, protected hydroxy, NH2, 
NHRb, or NRbRc, C1-C6 alkyl, C2-C6 alkynyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 
heterocyclyl, NC(O)R17, or NC(O)Ro. 

Examples of universal bases include: 



[Structure drawings of example universal bases] 





In one aspect, the invention features methods of producing iRNA agents, e.g., sRNA 
agents, e.g., an sRNA agent described herein, having the ability to mediate RNAi. These 
iRNA agents can be formulated for administration to a subject. 

In another aspect, the invention features a method of administering an iRNA agent, 
e.g., a double-stranded iRNA agent, or sRNA agent, to a subject (e.g., a human subject). The 
method includes administering a unit dose of the iRNA agent, e.g., a sRNA agent, e.g., a 
double stranded sRNA agent that (a) has a double-stranded part 19-25 nucleotides (nt) long, 
preferably 21-23 nt, (b) is complementary to a target RNA (e.g., an endogenous or pathogen 
target RNA), and, optionally, (c) includes at least one 3' overhang 1-5 nucleotides long. In 
one embodiment, the unit dose is less than 1.4 mg per kg of bodyweight, or less than 10, 5, 2, 
1, 0.5, 0.1, 0.05, 0.01, 0.005, 0.001, 0.0005, 0.0001, 0.00005 or 0.00001 mg per kg of 
bodyweight, and less than 200 nmole of RNA agent (e.g., about 4.4 x 10^16 copies) per kg of 
bodyweight, or less than 1500, 750, 300, 150, 75, 15, 7.5, 1.5, 0.75, 0.15, 0.075, 0.015, 
0.0075, 0.0015, 0.00075, 0.00015 nmole of RNA agent per kg of bodyweight. 
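
The dose figures above mix mass, molar amount, and copy number. The following Python sketch shows how such quantities can be converted into one another. It is illustrative only: the function names are hypothetical, and the molecular weight of 7000 g/mol used in the usage note is an assumption chosen for the example, since no molecular weight is stated in this passage.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def nmol_to_copies(nmol: float) -> float:
    """Convert an amount in nanomoles to a number of molecules."""
    return nmol * 1e-9 * AVOGADRO


def copies_to_nmol(copies: float) -> float:
    """Convert a number of molecules to an amount in nanomoles."""
    return copies / AVOGADRO * 1e9


def mg_to_nmol(mg: float, molecular_weight: float) -> float:
    """Convert a mass in mg to nanomoles; molecular_weight is in g/mol."""
    return mg * 1e-3 / molecular_weight * 1e9
```

Under these conversions, nmol_to_copies(200) is roughly 1.2 x 10^17 molecules, copies_to_nmol(4.4e16) is roughly 73 nmole, and mg_to_nmol(1.4, 7000) is roughly 200 nmole for the assumed molecular weight.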

The defined amount can be an amount effective to treat or prevent a disease or 
disorder, e.g., a disease or disorder associated with the target RNA. The unit dose, for 
example, can be administered by injection (e.g., intravenous or intramuscular), an inhaled 
dose, or a topical application. Particularly preferred dosages are less than 2, 1, or 0.1 mg/kg 
of body weight. 

In a preferred embodiment, the unit dose is administered less frequently than once a 
day, e.g., less than every 2, 4, 8 or 30 days. In another embodiment, the unit dose is not 
administered with a frequency (e.g., not a regular frequency). For example, the unit dose 
may be administered a single time. 

In one embodiment, the effective dose is administered with other traditional 
therapeutic modalities. In one embodiment, the subject has a viral infection and the modality 
is an antiviral agent other than an iRNA agent, e.g., other than a double-stranded iRNA 
agent, or sRNA agent. In another embodiment, the subject has atherosclerosis and the 
effective dose of an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, is 
administered in combination with, e.g., after surgical intervention, e.g., angioplasty. 

In one embodiment, a subject is administered an initial dose and one or more 
maintenance doses of an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a 
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof). The maintenance dose or doses are generally lower than the initial dose, 
e.g., one-half less than the initial dose. A maintenance regimen can include treating the subject 
with a dose or doses ranging from 0.01 µg to 1.4 mg/kg of body weight per day, e.g., 10, 1, 
0.1, 0.01, 0.001, or 0.00001 mg per kg of bodyweight per day. The maintenance doses are 
preferably administered no more than once every 5, 10, or 30 days. 

In one embodiment, the iRNA agent pharmaceutical composition includes a plurality 
of iRNA agent species. In another embodiment, the iRNA agent species has sequences that 
are non-overlapping and non-adjacent to another species with respect to a naturally occurring 
target sequence. In another embodiment, the plurality of iRNA agent species is specific for 
different naturally occurring target genes. In another embodiment, the iRNA agent is allele 
specific. 

The inventors have discovered that iRNA agents described herein can be administered 
to mammals, particularly large mammals such as nonhuman primates or humans in a number 
of ways. 

In one embodiment, the administration of the iRNA agent, e.g., a double-stranded 
iRNA agent, or sRNA agent, composition is parenteral, e.g. intravenous (e.g., as a bolus or as 
a diffusible infusion), intradermal, intraperitoneal, intramuscular, intrathecal, intraventricular, 
intracranial, subcutaneous, transmucosal, buccal, sublingual, endoscopic, rectal, oral, vaginal, 
topical, pulmonary, intranasal, urethral or ocular. Administration can be provided by the 
subject or by another person, e.g., a health care provider. The medication can be provided in 
measured doses or in a dispenser that delivers a metered dose. Selected modes of delivery 
are discussed in more detail below. 

The invention provides methods, compositions, and kits for rectal administration or 
delivery of iRNA agents described herein. 

Accordingly, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a 
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
or precursor thereof) described herein, e.g., a therapeutically effective amount of an iRNA 
agent described herein, e.g., an iRNA agent having a double stranded region of less than 40, 
and preferably less than 30 nucleotides and having one or two 1-3 nucleotide single strand 3' 
overhangs can be administered rectally, e.g., introduced through the rectum into the lower or 
upper colon. This approach is particularly useful in the treatment of inflammatory disorders, 
disorders characterized by unwanted cell proliferation, e.g., polyps, or colon cancer. 

In some embodiments the medication is delivered to a site in the colon by introducing 
a dispensing device, e.g., a flexible, camera-guided device similar to that used for inspection 
of the colon or removal of polyps, which includes means for delivery of the medication. 

In one embodiment, the rectal administration of the iRNA agent is by means of an 
enema. The iRNA agent of the enema can be dissolved in a saline or buffered solution. 

In another embodiment, the rectal administration is by means of a suppository. The 
suppository can include other ingredients, e.g., an excipient, e.g., cocoa butter or 
hydroxypropylmethylcellulose. 

The invention also provides methods, compositions, and kits for oral delivery of 
iRNA agents described herein. 

Accordingly, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a 
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof) described herein, e.g., a therapeutically effective amount of an iRNA agent 
described herein, e.g., an iRNA agent having a double stranded region of less than 40 and 
preferably less than 30 nucleotides and having one or two 1-3 nucleotide single strand 3' 
overhangs can be administered orally. 

Oral administration can be in the form of tablets, capsules, gel capsules, lozenges, 
troches or liquid syrups. In a preferred embodiment the composition is applied topically to a 
surface of the oral cavity. 

The invention also provides methods, compositions, and kits for buccal delivery of 
iRNA agents described herein. 

Accordingly, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a 
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof) described herein, e.g., a therapeutically effective amount of an iRNA agent 
having a double stranded region of less than 40 and preferably less than 30 nucleotides and 
having one or two 1-3 nucleotide single strand 3' overhangs can be administered to the buccal 
cavity. The medication can be sprayed into the buccal cavity or applied directly, e.g., in a 
liquid, solid, or gel form to a surface in the buccal cavity. This administration is particularly 
desirable for the treatment of inflammations of the buccal cavity, e.g., the gums or tongue. 
In one embodiment, the buccal administration is by spraying into the cavity, e.g., 
without inhalation, from a dispenser, e.g., a metered dose spray dispenser that dispenses the 
pharmaceutical composition and a propellant. 

The invention also provides methods, compositions, and kits for ocular delivery of 
iRNA agents described herein. 

Accordingly, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a 
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof) described herein, e.g., a therapeutically effective amount of an iRNA agent 
described herein, e.g., a sRNA agent having a double stranded region of less than 40 and 
preferably less than 30 nucleotides and having one or two 1-3 nucleotide single strand 3' 
overhangs can be administered to ocular tissue. 

The medication can be applied to the surface of the eye or nearby tissue, e.g., the 
inside of the eyelid. It can be applied topically, e.g., by spraying, in drops, as an eyewash, or 
an ointment. Administration can be provided by the subject or by another person, e.g., a 
health care provider. The medication can be provided in measured doses or in a dispenser 
that delivers a metered dose. 

The medication can also be administered to the interior of the eye, and can be 
introduced by a needle or other delivery device which can introduce it to a selected area or 
structure. 

Ocular treatment is particularly desirable for treating inflammation of the eye or 
nearby tissue. 

The invention also provides methods, compositions, and kits for delivery of iRNA 
agents described herein to or through the skin. 

Accordingly, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a 
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof) described herein, e.g., a therapeutically effective amount of an iRNA agent 
described herein, e.g., a sRNA agent having a double stranded region of less than 40 and 
preferably less than 30 nucleotides and one or two 1-3 nucleotide single strand 3' overhangs 
can be administered directly to the skin. 

The medication can be applied topically or delivered in a layer of the skin, e.g., by the 
use of a microneedle or a battery of microneedles which penetrate into the skin, but 
preferably not into the underlying muscle tissue. 

In one embodiment, the administration of the iRNA agent composition is topical. In 
another embodiment, topical administration delivers the composition to the dermis or 
epidermis of a subject. In other embodiments the topical administration is in the form of 
transdermal patches, ointments, lotions, creams, gels, drops, suppositories, sprays, liquids or 
powders. A composition for topical administration can be formulated as a liposome, micelle, 
emulsion, or other lipophilic molecular assembly. 

In another embodiment, the transdermal administration is applied with at least one 
penetration enhancer. In other embodiments, the penetration can be enhanced with 
iontophoresis, phonophoresis, or sonophoresis. 

In another aspect, the invention provides methods, compositions, devices, and kits for 
pulmonary delivery of iRNA agents described herein. 


Accordingly, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a 
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof) described herein, e.g., a therapeutically effective amount of an iRNA agent, 
e.g., a sRNA agent having a double stranded region of less than 40, preferably less than 30 
nucleotides and having one or two 1-3 nucleotide single strand 3' overhangs can be 
administered to the pulmonary system. Pulmonary administration can be achieved by 
inhalation or by the introduction of a delivery device into the pulmonary system, e.g., by 
introducing a delivery device which can dispense the medication. 

The preferred method of pulmonary delivery is by inhalation. The medication can be 
provided in a dispenser which delivers the medication, e.g., wet or dry, in a form sufficiently 
small such that it can be inhaled. The device can deliver a metered dose of medication. The 
subject, or another person, can administer the medication. 

Pulmonary delivery is effective not only for disorders which directly affect 
pulmonary tissue, but also for disorders which affect other tissue. 

iRNA agents can be formulated as a liquid or nonliquid, e.g., a powder, crystal, or 
aerosol for pulmonary delivery. 

In another aspect, the invention provides methods, compositions, devices, and kits for 
nasal delivery of iRNA agents described herein. Accordingly, an iRNA agent, e.g., a double- 
stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can 
be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double- 
stranded iRNA agent, or sRNA agent, or precursor thereof) described herein, e.g., a 
therapeutically effective amount of an iRNA agent, e.g., a sRNA agent having a double stranded 
region of less than 40 and preferably less than 30 nucleotides and having one or two 1-3 
nucleotide single strand 3' overhangs can be administered nasally. Nasal administration can 
be achieved by introduction of a delivery device into the nose, e.g., by introducing a delivery 
device which can dispense the medication. 

The preferred method of nasal delivery is by spray, aerosol, liquid, e.g., by drops, or 
by topical administration to a surface of the nasal cavity. The medication can be provided in 
a dispenser which delivers the medication, e.g., wet or dry, in a form sufficiently small 
such that it can be inhaled. The device can deliver a metered dose of medication. The 
subject, or another person, can administer the medication. 

Nasal delivery is effective not only for disorders which directly affect nasal tissue, but 
also for disorders which affect other tissue. 

iRNA agents can be formulated as a liquid or nonliquid, e.g., a powder, crystal, or 
aerosol for nasal delivery. 

In another embodiment, the iRNA agent is packaged in a viral natural capsid or in a 
chemically or enzymatically produced artificial capsid or structure derived therefrom. 

In one aspect of the invention, the dosage of a pharmaceutical composition including 
an iRNA agent is administered in order to alleviate the symptoms of a disease state, e.g., 
cancer or a cardiovascular disease. 

In another aspect, gene expression in a subject is modulated by administering a 
pharmaceutical composition including an iRNA agent. In other embodiments, a subject is 
treated with the pharmaceutical composition by any of the methods mentioned above. In 
another embodiment, the subject has cancer. 

An iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a 
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA 
which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof) composition can be administered as a liposome. For example, the 
composition can be prepared by a method that includes: (1) contacting an iRNA agent with an 
amphipathic cationic lipid conjugate in the presence of a detergent; and (2) removing the 
detergent to form an iRNA agent and cationic lipid complex. In one embodiment, the 
detergent is cholate, deoxycholate, lauryl sarcosine, octanoyl sucrose, CHAPS 
(3-[(3-cholamidopropyl)-di-methylamine]-2-hydroxyl-1-propane), nonyl-β-D-glucopyranoside, 
lauryl dimethylamine oxide, or octylglucoside. The iRNA agent can be an sRNA agent. The 
method can include preparing a composition that includes a plurality of iRNA agents, e.g., 
specific for one or more different endogenous target RNAs. The method can include other 
features described herein. 

In another aspect, a subject is treated by administering a defined amount of an iRNA 
agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger 
iRNA agent which can be processed into a sRNA agent) composition that is in a powdered 
form. In one embodiment, the powder is a collection of microparticles. In one embodiment, 
the powder is a collection of crystalline particles. The composition can include a plurality of 
iRNA agents, e.g., specific for one or more different endogenous target RNAs. The method 
can include other features described herein. 

In one aspect, a subject is treated by administering a defined amount of an iRNA agent 
composition that is prepared by a method that includes spray-drying, i.e., atomizing a liquid 
solution, emulsion, or suspension, immediately exposing the droplets to a drying gas, and 
collecting the resulting porous powder particles. The composition can include a plurality of 
iRNA agents, e.g., specific for one or more different endogenous target RNAs. The method 
can include other features described herein. 

In one aspect, the iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a 
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof), is provided in a powdered, crystallized or other finely divided form, with 
or without a carrier, e.g., a micro- or nano-particle suitable for inhalation or other pulmonary 
delivery. In one embodiment, this includes providing an aerosol preparation, e.g., an 
aerosolized spray-dried composition. The aerosol composition can be provided in and/or 
dispensed by a metered dose delivery device. 

In another aspect, a subject is treated for a condition treatable by inhalation. In one 
embodiment, this method includes aerosolizing a spray-dried iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can 
be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double- 
stranded iRNA agent, or sRNA agent, or precursor thereof) composition and inhaling the 
aerosolized composition. The iRNA agent can be an sRNA. The composition can include a 
plurality of iRNA agents, e.g., specific for one or more different endogenous target RNAs. 
The method can include other features described herein. 

In another aspect, the invention features a method of treating a subject that includes: 
administering a composition including an effective/defined amount of an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent 
which can be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, or precursor thereof), wherein the composition 
is prepared by a method that includes spray-drying, lyophilization, vacuum drying, 
evaporation, fluid bed drying, or a combination of these techniques. 

In another aspect, the invention features a method that includes: evaluating a 
parameter related to the abundance of a transcript in a cell of a subject; comparing the 
evaluated parameter to a reference value; and if the evaluated parameter has a preselected 
relationship to the reference value (e.g., it is greater), administering an iRNA agent (or a 
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA 
which encodes an iRNA agent or precursor thereof) to the subject. In one embodiment, the 
iRNA agent includes a sequence that is complementary to the evaluated transcript. For 
example, the parameter can be a direct measure of transcript levels, a measure of a protein 
level, or a disease or disorder symptom or characterization (e.g., rate of cell proliferation 
and/or tumor mass, viral load). 

In another aspect, the invention features a method that includes: administering a first 
amount of a composition that comprises an iRNA agent, e.g., a double-stranded iRNA agent, 
or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed into a 
sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, 
or sRNA agent, or precursor thereof) to a subject, wherein the iRNA agent includes a strand 
substantially complementary to a target nucleic acid; evaluating an activity associated with a 
protein encoded by the target nucleic acid; wherein the evaluation is used to determine if a 
second amount should be administered. In a preferred embodiment the method includes 
administering a second amount of the composition, wherein the timing of administration or 
dosage of the second amount is a function of the evaluating. The method can include other 
features described herein. 
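
The evaluate-and-compare step recited in the two preceding aspects can be illustrated with a 
minimal sketch. The function name should_administer, the relationship labels, and the numeric 
values in the usage lines are hypothetical and are not taken from this specification; the 
sketch only shows one way a "preselected relationship to the reference value" could be 
encoded as a comparison. 

```python
def should_administer(evaluated: float, reference: float,
                      relationship: str = "greater") -> bool:
    """Return True when the evaluated parameter has the preselected
    relationship to the reference value.

    `relationship` is a hypothetical label: "greater" mirrors the example
    given in the text; "less" is included only for symmetry.
    """
    if relationship == "greater":
        return evaluated > reference
    if relationship == "less":
        return evaluated < reference
    raise ValueError(f"unknown relationship: {relationship!r}")


# Illustrative values only (arbitrary units of transcript abundance).
print(should_administer(evaluated=1.8, reference=1.0))   # True
print(should_administer(evaluated=0.5, reference=1.0))   # False
```

The same comparison could be applied to any of the parameters listed above, e.g., a protein 
level or viral load, provided the reference value is expressed in the same units. 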

In another aspect, the invention features a method of administering a source of a 
double-stranded iRNA agent (ds iRNA agent) to a subject. The method includes 
administering or implanting a source of a ds iRNA agent, e.g., a sRNA agent, that (a) 
includes a double-stranded region that is 19-25 nucleotides long, preferably 21-23 
nucleotides, (b) is complementary to a target RNA (e.g., an endogenous RNA or a pathogen 
RNA), and, optionally, (c) includes at least one 3' overhang 1-5 nt long. In one embodiment, 
the source releases ds iRNA agent over time, e.g., the source is a controlled or a slow release 
source, e.g., a microparticle that gradually releases the ds iRNA agent. In another 
embodiment, the source is a pump, e.g., a pump that includes a sensor or a pump that can 
release one or more unit doses. 
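
The dimensional criteria (a) and (c) above can be expressed as a simple range check. The 
sketch below is illustrative only: the names DuplexCheck and check_ds_irna are hypothetical, 
and the check covers only the lengths quoted in the text (duplex of 19-25 nt, preferably 
21-23 nt; optional 3' overhangs of 1-5 nt). It does not assess complementarity, criterion (b). 

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class DuplexCheck:
    in_range: bool     # duplex length within 19-25 nt
    preferred: bool    # duplex length within 21-23 nt
    overhang_ok: bool  # every 3' overhang that is present is 1-5 nt long


def check_ds_irna(duplex_len: int, overhangs_3p: Sequence[int] = ()) -> DuplexCheck:
    """Compare a candidate's dimensions with the ranges quoted in the text."""
    present = [n for n in overhangs_3p if n > 0]
    return DuplexCheck(
        in_range=19 <= duplex_len <= 25,
        preferred=21 <= duplex_len <= 23,
        # Criterion (c) is optional, so a blunt-ended duplex also passes.
        overhang_ok=all(1 <= n <= 5 for n in present),
    )


# A 21 nt duplex with two 2 nt 3' overhangs satisfies all three checks.
print(check_ds_irna(21, (2, 2)))
```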

In one aspect, the invention features a pharmaceutical composition that includes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) 
including a nucleotide sequence complementary to a target RNA, e.g., substantially and/or 
exactly complementary. The target RNA can be a transcript of an endogenous human gene. 
In one embodiment, the iRNA agent (a) is 19-25 nucleotides long, preferably 21-23 
nucleotides, (b) is complementary to an endogenous target RNA, and, optionally, (c) includes 
at least one 3' overhang 1-5 nt long. In one embodiment, the pharmaceutical composition can 
be an emulsion, microemulsion, cream, jelly, or liposome. 

In one example the pharmaceutical composition includes an iRNA agent mixed with a 
topical delivery agent. The topical delivery agent can be a plurality of microscopic vesicles. 
The microscopic vesicles can be liposomes. In a preferred embodiment the liposomes are 
cationic liposomes. 

In another aspect, the pharmaceutical composition includes an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent (e.g., a precursor, e.g., a larger iRNA agent 
which can be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, or precursor thereof) admixed with a topical 
penetration enhancer. In one embodiment, the topical penetration enhancer is a fatty acid. 
The fatty acid can be arachidonic acid, oleic acid, lauric acid, caprylic acid, capric acid, 
myristic acid, palmitic acid, stearic acid, linoleic acid, linolenic acid, dicaprate, tricaprate, 
monoolein, dilaurin, glyceryl 1-monocaprate, 1-dodecylazacycloheptan-2-one, an 
acylcarnitine, an acylcholine, or a C1-10 alkyl ester, monoglyceride, diglyceride or 
pharmaceutically acceptable salt thereof. 

In another embodiment, the topical penetration enhancer is a bile salt. The bile salt 
can be cholic acid, dehydrocholic acid, deoxycholic acid, glucholic acid, glycholic acid, 
glycodeoxycholic acid, taurocholic acid, taurodeoxycholic acid, chenodeoxycholic acid, 
ursodeoxycholic acid, sodium tauro-24,25-dihydro-fusidate, sodium glycodihydrofusidate, 
polyoxyethylene-9-lauryl ether or a pharmaceutically acceptable salt thereof. 

In another embodiment, the penetration enhancer is a chelating agent. The chelating 
agent can be EDTA, citric acid, a salicylate, an N-acyl derivative of collagen, laureth-9, an 
N-amino acyl derivative of a beta-diketone or a mixture thereof. 

In another embodiment, the penetration enhancer is a surfactant, e.g., an ionic or 
nonionic surfactant. The surfactant can be sodium lauryl sulfate, polyoxyethylene-9-lauryl 
ether, polyoxyethylene-20-cetyl ether, a perfluorochemical emulsion or mixture thereof. 

In another embodiment, the penetration enhancer can be selected from a group 
consisting of unsaturated cyclic ureas, 1-alkyl-alkones, 1-alkenylazacyclo-alkanones, 
steroidal anti-inflammatory agents and mixtures thereof. In yet another embodiment the 
penetration enhancer can be a glycol, a pyrrol, an azone, or a terpene. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a 
form suitable for oral delivery. In one embodiment, oral delivery can be used to deliver an 
iRNA agent composition to a cell or a region of the gastro-intestinal tract, e.g., small 
intestine, colon (e.g., to treat a colon cancer), and so forth. The oral delivery form can be 
tablets, capsules or gel capsules. In one embodiment, the iRNA agent of the pharmaceutical 
composition modulates expression of a cellular adhesion protein, modulates a rate of cellular 
proliferation, or has biological activity against eukaryotic pathogens or retroviruses. In 
another embodiment, the pharmaceutical composition includes an enteric material that 
substantially prevents dissolution of the tablets, capsules or gel capsules in a mammalian 
stomach. In a preferred embodiment the enteric material is a coating. The coating can be 
acetate phthalate, propylene glycol, sorbitan monooleate, cellulose acetate trimellitate, 
hydroxypropyl methylcellulose phthalate or cellulose acetate phthalate. 

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes a penetration enhancer. The penetration enhancer can be a bile salt or a fatty acid. 
The bile salt can be ursodeoxycholic acid, chenodeoxycholic acid, and salts thereof. The 
fatty acid can be capric acid, lauric acid, and salts thereof. 

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes an excipient. In one example the excipient is polyethyleneglycol. In another 
example the excipient is precirol. 

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes a plasticizer. The plasticizer can be diethyl phthalate, triacetin, dibutyl sebacate, 
dibutyl phthalate or triethyl citrate. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent and a delivery vehicle. In one embodiment, the iRNA agent (a) is 19-25 
nucleotides long, preferably 21-23 nucleotides, (b) is complementary to an endogenous target 
RNA, and, optionally, (c) includes at least one 3' overhang 1-5 nucleotides long. 

In one embodiment, the delivery vehicle can deliver an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which 
can be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, or precursor thereof) to a cell by a topical route of 
administration. The delivery vehicle can be microscopic vesicles. In one example the 
microscopic vesicles are liposomes. In a preferred embodiment the liposomes are cationic 
liposomes. In another example the microscopic vesicles are micelles. 

In one aspect, the invention features a method for making a pharmaceutical 
composition, the method including: (1) contacting an iRNA agent, e.g., a double-stranded 
iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be 
processed into a sRNA agent) with an amphipathic cationic lipid conjugate in the presence of 
a detergent; and (2) removing the detergent to form an iRNA agent and cationic lipid complex. 

In another aspect, the invention features a pharmaceutical composition produced by a 
method including: (1) contacting an iRNA agent, e.g., a double-stranded iRNA agent, or 
sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed into a 
sRNA agent) with an amphipathic cationic lipid conjugate in the presence of a detergent; and 
(2) removing the detergent to form an iRNA agent and cationic lipid complex. In one 
embodiment, the detergent is cholate, deoxycholate, lauryl sarcosine, octanoyl sucrose, 
CHAPS (3-[(3-cholamidopropyl)-di-methylamine]-2-hydroxyl-1-propane), 
nonyl-β-D-glucopyranoside, lauryl dimethylamine oxide, or octylglucoside. In another embodiment, the 
amphipathic cationic lipid conjugate is biodegradable. In yet another embodiment the 
pharmaceutical composition includes a targeting ligand. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in an 
injectable dosage form. In one embodiment, the injectable dosage form of the 
pharmaceutical composition includes sterile aqueous solutions or dispersions and sterile 
powders. In a preferred embodiment the sterile solution can include a diluent such as water, 
saline solution, fixed oils, polyethylene glycols, glycerin, or propylene glycol. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in 
oral dosage form. In one embodiment, the oral dosage form is selected from the group 
consisting of tablets, capsules and gel capsules. In another embodiment, the pharmaceutical 
composition includes an enteric material that substantially prevents dissolution of the tablets, 
capsules or gel capsules in a mammalian stomach. In a preferred embodiment the enteric 
material is a coating. The coating can be acetate phthalate, propylene glycol, sorbitan 
monooleate, cellulose acetate trimellitate, hydroxypropyl methylcellulose phthalate or 
cellulose acetate phthalate. In one embodiment, the oral dosage form of the pharmaceutical 
composition includes a penetration enhancer, e.g., a penetration enhancer described herein. 

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes an excipient. In one example the excipient is polyethyleneglycol. In another 
example the excipient is precirol. 

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes a plasticizer. The plasticizer can be diethyl phthalate, triacetin, dibutyl sebacate, 
dibutyl phthalate or triethyl citrate. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a 
rectal dosage form. In one embodiment, the rectal dosage form is an enema. In another 
embodiment, the rectal dosage form is a suppository. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a 
vaginal dosage form. In one embodiment, the vaginal dosage form is a suppository. In 
another embodiment, the vaginal dosage form is a foam, cream, or gel. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a 
pulmonary or nasal dosage form. In one embodiment, the iRNA agent is incorporated into a 
particle, e.g., a macroparticle, e.g., a microsphere. The particle can be produced by spray 
drying, lyophilization, evaporation, fluid bed drying, vacuum drying, or a combination 
thereof. The microsphere can be formulated as a suspension, a powder, or an implantable 
solid. 

In one aspect, the invention features a spray-dried iRNA agent, e.g., a double- 
stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can 
be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double- 
stranded iRNA agent, or sRNA agent, or precursor thereof) composition suitable for 
inhalation by a subject, including: (a) a therapeutically effective amount of an iRNA agent 
suitable for treating a condition in the subject by inhalation; (b) a pharmaceutically 
acceptable excipient selected from the group consisting of carbohydrates and amino acids; 
and (c) optionally, a dispersibility-enhancing amount of a physiologically-acceptable, water- 
soluble polypeptide. 

In one embodiment, the excipient is a carbohydrate. The carbohydrate can be 
selected from the group consisting of monosaccharides, disaccharides, trisaccharides, and 
polysaccharides. In a preferred embodiment the carbohydrate is a monosaccharide selected 
from the group consisting of dextrose, galactose, mannitol, D-mannose, sorbitol, and sorbose. 
In another preferred embodiment the carbohydrate is a disaccharide selected from the group 
consisting of lactose, maltose, sucrose, and trehalose. 

In another embodiment, the excipient is an amino acid. In one embodiment, the 
amino acid is a hydrophobic amino acid. In a preferred embodiment the hydrophobic amino 
acid is selected from the group consisting of alanine, isoleucine, leucine, methionine, 
phenylalanine, proline, tryptophan, and valine. In yet another embodiment the amino acid is a 
polar amino acid. In a preferred embodiment the amino acid is selected from the group 
consisting of arginine, histidine, lysine, cysteine, glycine, glutamine, serine, threonine, 
tyrosine, aspartic acid and glutamic acid. 

In one embodiment, the dispersibility-enhancing polypeptide is selected from the 
group consisting of human serum albumin, α-lactalbumin, trypsinogen, and polyalanine. 

In one embodiment, the spray-dried iRNA agent composition includes particles 
having a mass median diameter (MMD) of less than 10 microns. In another embodiment, 
the spray-dried iRNA agent composition includes particles having a mass median diameter of 
less than 5 microns. In yet another embodiment the spray-dried iRNA agent composition 
includes particles having a mass median aerodynamic diameter (MMAD) of less than 5 
microns. 
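
To make the particle size limits above concrete, the following sketch computes a mass median 
diameter from a list of measured particle diameters. It is a simplified illustration under 
stated assumptions: particles are treated as spheres of equal density, so that mass scales 
with the cube of the diameter, and the function name mass_median_diameter and the sample 
values are hypothetical. The MMAD is not computed here, because aerodynamic diameter also 
depends on particle density and shape, which this sketch does not model. 

```python
def mass_median_diameter(diameters_um):
    """Return the diameter (in microns) at which the cumulative mass of the
    size-sorted particles first reaches half of the total mass.

    Assumes spherical particles of uniform density, so mass is
    proportional to diameter cubed.
    """
    if not diameters_um:
        raise ValueError("at least one particle diameter is required")
    sized = sorted(diameters_um)
    masses = [d ** 3 for d in sized]
    half = sum(masses) / 2.0
    running = 0.0
    for d, m in zip(sized, masses):
        running += m
        if running >= half:
            return d


sample = [1, 2, 3, 4]  # illustrative diameters in microns
mmd = mass_median_diameter(sample)
print(mmd, mmd < 10, mmd < 5)
```

Because mass grows with the cube of the diameter, a few large particles dominate the result; 
in the sample above the mass median lies at the largest particle even though half of the 
particles by count are 2 microns or smaller. 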



In certain other aspects, the invention provides kits that include a suitable container 
containing a pharmaceutical formulation of an iRNA agent, e.g., a double-stranded iRNA 
agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed 
into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA 
agent, or sRNA agent, or precursor thereof). In certain embodiments the individual 
components of the pharmaceutical formulation may be provided in one container. 
Alternatively, it may be desirable to provide the components of the pharmaceutical 
formulation separately in two or more containers, e.g., one container for an iRNA agent 
preparation, and at least another for a carrier compound. The kit may be packaged in a 
number of different configurations such as one or more containers in a single box. The 
different components can be combined, e.g., according to instructions provided with the kit. 
The components can be combined according to a method described herein, e.g., to prepare 
and administer a pharmaceutical composition. The kit can also include a delivery device. 

In another aspect, the invention features a device, e.g., an implantable device, wherein 
the device can dispense or administer a composition that includes an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent 
which can be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, or precursor thereof), e.g., an iRNA agent that 
silences an endogenous transcript. In one embodiment, the device is coated with the 
composition. In another embodiment the iRNA agent is disposed within the device. In 
another embodiment, the device includes a mechanism to dispense a unit dose of the 
composition. In other embodiments the device releases the composition continuously, e.g., 
by diffusion. Exemplary devices include stents, catheters, pumps, artificial organs or organ 
components (e.g., artificial heart, a heart valve, etc.), and sutures. 

As used herein, the term "crystalline" describes a solid having the structure or 
characteristics of a crystal, i.e., particles of three-dimensional structure in which the plane 
faces intersect at definite angles and in which there is a regular internal structure. The 
compositions of the invention may have different crystalline forms. Crystalline forms can be 
prepared by a variety of methods, including, for example, spray drying. 

As used herein, "specifically hybridizable" and "complementary" are terms which are 
used to indicate a sufficient degree of complementarity such that stable and specific binding 
occurs between a compound of the invention and a target RNA molecule. Specific binding 
requires a sufficient degree of complementarity to avoid non-specific binding of the 
oligomeric compound to non-target sequences under conditions in which specific binding is 
desired, i.e., under physiological conditions in the case of in vivo assays or therapeutic 
treatment, or in the case of in vitro assays, under conditions in which the assays are 
performed. The non-target sequences typically differ by at least 5 nucleotides. 

In one embodiment, an iRNA agent is "sufficiently complementary" to a target RNA, 
e.g., a target mRNA, such that the iRNA agent silences production of protein encoded by the 
target mRNA. In another embodiment, the iRNA agent is "exactly complementary" to a 
target RNA, e.g., the target RNA and the iRNA agent anneal, preferably to form a hybrid 
made exclusively of Watson-Crick basepairs in the region of exact complementarity. A 
"sufficiently complementary" iRNA agent can include an internal region (e.g., of at least 10 
nucleotides) that is exactly complementary to a target RNA. Moreover, in some 
embodiments, the iRNA agent specifically discriminates a single-nucleotide difference. In 
this case, the iRNA agent only mediates RNAi if exact complementarity is found in the region 
of (e.g., within 7 nucleotides of) the single-nucleotide difference. 
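
The distinction between exact complementarity and an internal region of exact 
complementarity can be illustrated computationally. The sketch below is not part of the 
disclosed methods: the function names and the example sequence are hypothetical, only strict 
Watson-Crick pairs (A-U, G-C) are counted, and the two sequences are assumed to be of equal 
length and aligned antiparallel without gaps. 

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def is_exactly_complementary(strand: str, target_site: str) -> bool:
    """True if every position forms a Watson-Crick pair with the target site."""
    return strand.upper() == reverse_complement(target_site)


def longest_exact_run(strand: str, target_site: str) -> int:
    """Length of the longest uninterrupted stretch of Watson-Crick pairs."""
    if len(strand) != len(target_site):
        raise ValueError("sequences must have equal length")
    best = run = 0
    for a, b in zip(strand.upper(), reversed(target_site.upper())):
        run = run + 1 if COMPLEMENT.get(a) == b else 0
        best = max(best, run)
    return best


target = "AUGGCUAGCUAGGCUAACGUA"   # hypothetical 21 nt target site
strand = reverse_complement(target)
print(is_exactly_complementary(strand, target))      # True

# Introduce a single mismatch at the central position of the strand.
swap = "A" if strand[10] != "A" else "C"
mutated = strand[:10] + swap + strand[11:]
print(is_exactly_complementary(mutated, target))     # False
print(longest_exact_run(mutated, target) >= 10)      # True
```

In this example the mismatched strand is no longer exactly complementary, yet it retains 
internal regions of 10 nucleotides that are, matching the "at least 10 nucleotides" example 
given above. 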

As used herein, the term "oligonucleotide" refers to a nucleic acid molecule (RNA or 
DNA) preferably of length less than 100, 200, 300, or 400 nucleotides. 

Unless otherwise defined, all technical and scientific terms used herein have the same 
meaning as commonly understood by one of ordinary skill in the art to which this invention 
pertains. The materials, methods, and examples are illustrative only and not intended to be 
limiting. Although methods and materials similar or equivalent to those described herein can 
be used in the practice or testing of the present invention, useful methods and materials are 
described below. Other features and advantages of the invention will be apparent from the 
accompanying drawings and description, and from the claims. The contents of all references, 
pending patent applications and published patents, cited throughout this application are 
hereby expressly incorporated by reference. In case of conflict, the present specification, 
including definitions, will control. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a structural representation of base pairing in pseudocomplementary siRNA 2. 

FIG. 2 is a schematic representation of dual targeting siRNAs designed to target the 
HCV genome. 

FIG. 3 is a schematic representation of pseudocomplementary, bifunctional siRNAs 
designed to target the HCV genome. 

FIG. 4 is a general synthetic scheme for incorporation of RRMS monomers into an 
oligonucleotide. 

FIG. 5 is a table of representative RRMS carriers. Panel 1 shows pyrroline-based 
RRMSs; panel 2 shows 3-hydroxyproline-based RRMSs; panel 3 shows piperidine-based 
RRMSs; panel 4 shows morpholine and piperazine-based RRMSs; and panel 5 shows 
decalin-based RRMSs. R1 is succinate or phosphoramidate and R2 is H or a conjugate 
ligand. 

FIG. 6A is a graph depicting levels of luciferase mRNA in livers of CMV-Luc mice 
(Xenogen) following intravenous (iv) injection of buffer or siRNA into the tail vein. Each 
bar represents data from one mouse. RNA levels were quantified by QuantiGene Assay 
(Genospectra, Inc.; Fremont, CA). The Y axis represents chemiluminescence values in 
counts per second (CPS). 

FIG. 6B is a graph depicting levels of luciferase mRNA in livers of CMV-Luc mice 
(Xenogen). The values are averaged from the data depicted in FIG. XxxA. 

FIG. 7 is a graph depicting the pharmacokinetics of cholesterol-conjugated and 
unconjugated siRNA. The diamonds represent the amount of unconjugated 33P-labeled 
siRNA (ALN-3000) in mouse plasma over time; the squares represent the amount of 
cholesterol-conjugated 33P-labeled siRNA (ALN-3001) in mouse plasma over time. "L1163" 
is equivalent to ALN-3000; "L1163Chol" is equivalent to ALN-3001. 

DETAILED DESCRIPTION 

Double-stranded RNA (dsRNA) directs the sequence-specific silencing of mRNA through a 
process known as RNA interference (RNAi). The process occurs in a wide variety of 
organisms, including mammals and other vertebrates. 

It has been demonstrated that 21-23 nt fragments of dsRNA are sequence-specific 
mediators of RNA silencing, e.g., by causing RNA degradation. While not wishing to be 
bound by theory, it may be that a molecular signal, which may be merely the specific length 
of the fragments, present in these 21-23 nt fragments recruits cellular factors that mediate 
RNAi. Described herein are methods for preparing and administering these 21-23 nt 
fragments, and other iRNA agents, and their use for specifically inactivating gene function. 
The use of iRNA agents (or recombinantly produced or chemically synthesized 
oligonucleotides of the same or similar nature) enables the targeting of specific mRNAs for 
silencing in mammalian cells. In addition, longer dsRNA agent fragments can also be used, 
e.g., as described below. 

Although, in mammalian cells, long dsRNAs can induce the interferon response 
which is frequently deleterious, sRNAs do not trigger the interferon response, at least not to 
an extent that is deleterious to the cell and host. In particular, the length of the iRNA agent 
strands in an sRNA agent can be less than 31, 30, 28, 25, or 23 nt, e.g., sufficiently short to 
avoid inducing a deleterious interferon response. Thus, the administration of a composition 
of sRNA agent (e.g., formulated as described herein) to a mammalian cell can be used to 
silence expression of a target gene while circumventing the interferon response. Further, a 
discrete species of iRNA agent can be used to selectively target one allele of a target 
gene, e.g., in a subject heterozygous for the allele. 

Moreover, in one embodiment, a mammalian cell is treated with an iRNA agent that 
disrupts a component of the interferon response, e.g., double stranded RNA (dsRNA)- 
activated protein kinase PKR. Such a cell can be treated with a second iRNA agent that 
includes a sequence complementary to a target RNA and that has a length that might 
otherwise trigger the interferon response. 

In a typical embodiment, the subject is a mammal such as a cow, horse, mouse, rat, 
dog, pig, goat, or a primate. The subject can be a dairy mammal (e.g., a cow, or goat) or 
other farmed animal (e.g., a chicken, turkey, sheep, pig, fish, shrimp). In a much preferred 
embodiment, the subject is a human, e.g., a normal individual or an individual that has, is 
diagnosed with, or is predicted to have a disease or disorder. 

Further, because iRNA agent mediated silencing persists for several days after 
administering the iRNA agent composition, in many instances, it is possible to administer the 
composition with a frequency of less than once per day, or, for some instances, only once 
for the entire therapeutic regimen. For example, treatment of some cancer cells may be 
mediated by a single bolus administration, whereas a chronic viral infection may require 
regular administration, e.g., once per week or once per month. 

A number of exemplary routes of delivery are described that can be used to 
administer an iRNA agent to a subject. In addition, the iRNA agent can be formulated 
according to an exemplary method described herein. 



iRNA AGENT STRUCTURE 

Described herein are isolated iRNA agents, e.g., RNA molecules (double-stranded or 
single-stranded), that mediate RNAi. The iRNA agents preferably mediate RNAi with 
respect to an endogenous gene of a subject or to a gene of a pathogen. 

An "RNA agent" as used herein, is an unmodified RNA, modified RNA, or 
nucleoside surrogate, all of which are defined herein (see, e.g., the section below entitled 
RNA Agents). While numerous modified RNAs and nucleoside surrogates are described, 
preferred examples include those which have greater resistance to nuclease degradation than 
do unmodified RNAs. Preferred examples include those which have a 2' sugar modification, 
a modification in a single strand overhang, preferably a 3' single strand overhang, or, 
particularly if single stranded, a 5' modification which includes one or more phosphate 
groups or one or more analogs of a phosphate group. 

An "iRNA agent" as used herein, is an RNA agent which can, or which can be 
cleaved into an RNA agent which can, down regulate the expression of a target gene, 
preferably an endogenous or pathogen target RNA. While not wishing to be bound by 
theory, an iRNA agent may act by one or more of a number of mechanisms, including post- 
transcriptional cleavage of a target mRNA sometimes referred to in the art as RNAi, or pre- 
transcriptional or pre-translational mechanisms. An iRNA agent can include a single strand 
or can include more than one strand, e.g., it can be a double stranded iRNA agent. If the 
iRNA agent is a single strand it is particularly preferred that it include a 5' modification 
which includes one or more phosphate groups or one or more analogs of a phosphate group. 

The iRNA agent should include a region of sufficient homology to the target gene, 
and be of sufficient length in terms of nucleotides, such that the iRNA agent, or a fragment 
thereof, can mediate down regulation of the target gene. (For ease of exposition the term 
nucleotide or ribonucleotide is sometimes used herein in reference to one or more monomeric 
subunits of an RNA agent. It will be understood herein that the usage of the term 
"ribonucleotide" or "nucleotide", herein can, in the case of a modified RNA or nucleotide 
surrogate, also refer to a modified nucleotide, or surrogate replacement moiety at one or more 
positions.) Thus, the iRNA agent is or includes a region which is at least partially, and in 
some embodiments fully, complementary to the target RNA. It is not necessary that there be 
perfect complementarity between the iRNA agent and the target, but the correspondence 
must be sufficient to enable the iRNA agent, or a cleavage product thereof, to direct sequence 
specific silencing, e.g., by RNAi cleavage of the target RNA, e.g., mRNA. 

Complementarity, or degree of homology with the target strand, is most critical in the 
antisense strand. While perfect complementarity, particularly in the antisense strand, is often 
desired, some embodiments can include, particularly in the antisense strand, one or more but 
preferably 6, 5, 4, 3, 2, or fewer mismatches (with respect to the target RNA). The 
mismatches, particularly in the antisense strand, are most tolerated in the terminal regions 
and if present are preferably in a terminal region or regions, e.g., within 6, 5, 4, or 3 
nucleotides of the 5' and/or 3' terminus. The sense strand need only be sufficiently 
complementary with the antisense strand to maintain the overall double strand character of 
the molecule. 
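
The mismatch preference described above can be illustrated with a minimal Python sketch. The function names, the default window of 6 nucleotides, and the restriction to strict Watson-Crick pairing (G:U wobble pairs are counted as mismatches) are assumptions made for illustration only and are not part of the disclosure.

```python
# Illustrative sketch only: names, defaults, and the strict Watson-Crick
# pairing rule (G:U wobble counted as a mismatch) are assumptions.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def mismatch_positions(antisense, target_site):
    """Return 0-based antisense positions (counted 5'->3') that are not
    Watson-Crick paired with the target site.

    Both sequences are written 5'->3' and must have equal length; the
    antisense strand is assumed to pair antiparallel with the target.
    """
    if len(antisense) != len(target_site):
        raise ValueError("antisense and target site must have equal length")
    opposite = target_site[::-1]  # target base facing antisense position i
    return [
        i
        for i, (a, t) in enumerate(zip(antisense, opposite))
        if COMPLEMENT.get(a) != t
    ]


def mismatches_are_terminal(antisense, target_site, window=6, max_mismatches=6):
    """True if there are at most `max_mismatches` mismatches and every one
    lies within `window` nucleotides of the 5' or 3' terminus."""
    n = len(antisense)
    positions = mismatch_positions(antisense, target_site)
    in_terminal_region = all(i < window or i >= n - window for i in positions)
    return len(positions) <= max_mismatches and in_terminal_region
```

In this sketch a fully complementary antisense strand yields an empty mismatch list, a single mismatch near a terminus passes the check, and a single mismatch in the center of a 21-nucleotide strand does not.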

As discussed elsewhere herein, an iRNA agent will often be modified or include 
nucleoside surrogates in addition to the RRMS. Single stranded regions of an iRNA agent 
will often be modified or include nucleoside surrogates, e.g., the unpaired region or regions 
of a hairpin structure, e.g., a region which links two complementary regions, can have 
modifications or nucleoside surrogates. Modifications that stabilize one or more 3'- or 5'-termini 
of an iRNA agent, e.g., against exonucleases, or that favor entry of the antisense sRNA 
agent into RISC, are also favored. Modifications can include C3 (or C6, C7, C12) amino 
linkers, thiol linkers, carboxyl linkers, non-nucleotidic spacers (C3, C6, C9, C12, abasic, 
triethylene glycol, hexaethylene glycol), special biotin or fluorescein reagents that come as 
phosphoramidites and that have another DMT-protected hydroxyl group, allowing multiple 
couplings during RNA synthesis. 

iRNA agents include: molecules that are long enough to trigger the interferon 
response (which can be cleaved by Dicer (Bernstein et al. 2001. Nature, 409:363-366) and 
enter a RISC (RNAi-induced silencing complex)); and, molecules which are sufficiently 
short that they do not trigger the interferon response (which molecules can also be cleaved by 
Dicer and/or enter a RISC), e.g., molecules which are of a size which allows entry into a 
RISC, e.g., molecules which resemble Dicer-cleavage products. Molecules that are short 
enough that they do not trigger an interferon response are termed sRNA agents or shorter 
iRNA agents herein. "sRNA agent or shorter iRNA agent" as used herein, refers to an iRNA 
agent, e.g., a double stranded RNA agent or single strand agent, that is sufficiently short that 
it does not induce a deleterious interferon response in a human cell, e.g., it has a duplexed 
region of less than 60 but preferably less than 50, 40, or 30 nucleotide pairs. The sRNA 
agent, or a cleavage product thereof, can down regulate a target gene, e.g., by inducing RNAi 
with respect to a target RNA, preferably an endogenous or pathogen target RNA. 

Each strand of an sRNA agent can be equal to or less than 30, 25, 24, 23, 22, 21, or 20 
nucleotides in length. The strand is preferably at least 19 nucleotides in length. For example, 
each strand can be between 21 and 25 nucleotides in length. Preferred sRNA agents have a 
duplex region of 17, 18, 19, 29, 21, 22, 23, 24, or 25 nucleotide pairs, and one or more 
overhangs, preferably one or two 3' overhangs, of 2-3 nucleotides. 

In addition to homology to target RNA and the ability to down regulate a target gene, 
an iRNA agent will preferably have one or more of the following properties: 

(1) it will be of the Formula 1, 2, 3, or 4 set out in the RNA Agent section below; 

(2) if single stranded it will have a 5' modification which includes one or more 
phosphate groups or one or more analogs of a phosphate group; 

(3) it will, despite modifications, even to a very large number, or all of the 
nucleosides, have an antisense strand that can present bases (or modified bases) in the proper 
three dimensional framework so as to be able to form correct base pairing and form a duplex 
structure with a homologous target RNA which is sufficient to allow down regulation of the 
target, e.g., by cleavage of the target RNA; 

(4) it will, despite modifications, even to a very large number, or all of the 
nucleosides, still have "RNA-like" properties, i.e., it will possess the overall structural, 
chemical and physical properties of an RNA molecule, even though not exclusively, or even 
partly, of ribonucleotide-based content. For example, an iRNA agent can contain, e.g., a 
sense and/or an antisense strand in which all of the nucleotide sugars contain, e.g., 2' fluoro in 
place of 2' hydroxyl. This deoxyribonucleotide-containing agent can still be expected to 
exhibit RNA-like properties. While not wishing to be bound by theory, the electronegative 
fluorine prefers an axial orientation when attached to the C2' position of ribose. This spatial 
preference of fluorine can, in turn, force the sugars to adopt a C3'-endo pucker. This is the 
same puckering mode as observed in RNA molecules and gives rise to the 
RNA-characteristic A-family-type helix. Further, since fluorine is a good hydrogen bond acceptor, 
it can participate in the same hydrogen bonding interactions with water molecules that are 
known to stabilize RNA structures. (Generally, it is preferred that a modified moiety at the 
2' sugar position will be able to enter into H-bonding which is more characteristic of the OH 
moiety of a ribonucleotide than the H moiety of a deoxyribonucleotide.) A preferred iRNA 
agent will: exhibit a C3'-endo pucker in all, or at least 50, 75, 80, 85, 90, or 95 % of its 
sugars; exhibit a C3'-endo pucker in a sufficient amount of its sugars that it can give rise to 
the RNA-characteristic A-family-type helix; have no more than 20, 10, 5, 4, 3, 2, or 1 
sugar which is not a C3'-endo pucker structure. These limitations are particularly preferred 
in the antisense strand; 

(5) regardless of the nature of the modification, and even though the RNA agent 
can contain deoxynucleotides or modified deoxynucleotides, particularly in overhang or 
other single strand regions, it is preferred that DNA molecules, or any molecule in which 
more than 50, 60, or 70 % of the nucleotides in the molecule, or more than 50, 60, or 70 % of 
the nucleotides in a duplexed region are deoxyribonucleotides, or modified 
deoxyribonucleotides which are deoxy at the 2' position, are excluded from the definition of 
RNA agent. 
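
The exclusion in item (5) amounts to a simple threshold on the fraction of 2'-deoxy positions, which can be sketched as follows. The function names and the per-nucleotide boolean input are hypothetical; deciding whether a given monomer counts as "deoxy at the 2' position" is left to the caller and is not resolved by this sketch.

```python
# Illustrative sketch only: function names and the boolean-per-nucleotide
# representation are assumptions, not part of the disclosure.

def deoxy_fraction(is_2prime_deoxy):
    """Fraction of positions flagged as deoxy at the 2' position.

    `is_2prime_deoxy` is an iterable of booleans, one per nucleotide of the
    molecule (or of its duplexed region).
    """
    flags = list(is_2prime_deoxy)
    if not flags:
        raise ValueError("at least one nucleotide is required")
    return sum(flags) / len(flags)


def excluded_as_dna_like(is_2prime_deoxy, threshold_percent=50):
    """True if MORE than `threshold_percent` % of positions are 2'-deoxy."""
    return deoxy_fraction(is_2prime_deoxy) * 100 > threshold_percent
```

Under these assumptions, a 21-nucleotide strand whose only deoxy positions are a two-nucleotide overhang falls well below the 50 % threshold, while a strand that is exactly 50 % deoxy is not excluded because the text requires more than the stated percentage.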

A "single strand iRNA agent" as used herein, is an iRNA agent which is made up of a 
single molecule. It may include a duplexed region, formed by intra-strand pairing, e.g., it 
may be, or include, a hairpin or pan-handle structure. Single strand iRNA agents are 
preferably antisense with regard to the target molecule. In preferred embodiments single 
strand iRNA agents are 5' phosphorylated or include a phosphoryl analog at the 5' 
terminus. 5'-phosphate modifications include those which are compatible with RISC 
mediated gene silencing. Suitable modifications include: 5'-monophosphate ((HO)2(O)P-O-5'); 
5'-diphosphate ((HO)2(O)P-O-P(HO)(O)-O-5'); 5'-triphosphate 
((HO)2(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'); 5'-guanosine cap (7-methylated or non-methylated) 
(7m-G-O-5'-(HO)(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'); 5'-adenosine cap (Appp), and any modified or 
unmodified nucleotide cap structure (N-O-5'-(HO)(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'); 
5'-monothiophosphate (phosphorothioate; (HO)2(S)P-O-5'); 5'-monodithiophosphate 
(phosphorodithioate; (HO)(HS)(S)P-O-5'), 5'-phosphorothiolate ((HO)2(O)P-S-5'); any 
additional combination of oxygen/sulfur replaced monophosphate, diphosphate and 
triphosphates (e.g. 5'-alpha-thiotriphosphate, 5'-gamma-thiotriphosphate, etc.), 
5'-phosphoramidates ((HO)2(O)P-NH-5', (HO)(NH2)(O)P-O-5'), 5'-alkylphosphonates 
(R=alkyl=methyl, ethyl, isopropyl, propyl, etc., e.g. RP(OH)(O)-O-5'-, (OH)2(O)P-5'-CH2-), 
5'-alkyletherphosphonates (R=alkylether=methoxymethyl (MeOCH2-), ethoxymethyl, etc., 
e.g. RP(OH)(O)-O-5'-). (These modifications can also be used with the antisense strand of a 
double stranded iRNA.) 

A single strand iRNA agent should be sufficiently long that it can enter the RISC and 
participate in RISC mediated cleavage of a target mRNA. A single strand iRNA agent is at 
least 14, and more preferably at least 15, 20, 25, 29, 35, 40, or 50 nucleotides in length. It is 
preferably less than 200, 100, or 60 nucleotides in length. 

Hairpin iRNA agents will have a duplex region equal to or at least 17, 18, 19, 29, 21, 
22, 23, 24, or 25 nucleotide pairs. The duplex region will preferably be equal to or less than 
200, 100, or 50 nucleotide pairs in length. Preferred ranges for the duplex region are 15-30, 17 to 23, 19 to 
23, and 19 to 21 nucleotide pairs in length. The hairpin will preferably have a single strand 
overhang or terminal unpaired region, preferably the 3', and preferably of the antisense side 
of the hairpin. Preferred overhangs are 2-3 nucleotides in length. 

A "double stranded (ds) iRNA agent" as used herein, is an iRNA agent which 
includes more than one, and preferably two, strands in which interchain hybridization can 
form a region of duplex structure. 

The antisense strand of a double stranded iRNA agent should be equal to or at least 
14, 15, 16, 17, 18, 19, 25, 29, 40, or 60 nucleotides in length. It should be equal to or less 
than 200, 100, or 50 nucleotides in length. Preferred ranges are 17 to 25, 19 to 23, and 19 
to 21 nucleotides in length. 

The sense strand of a double stranded iRNA agent should be equal to or at least 14, 
15, 16, 17, 18, 19, 25, 29, 40, or 60 nucleotides in length. It should be equal to or less than 
200, 100, or 50 nucleotides in length. Preferred ranges are 17 to 25, 19 to 23, and 19 to 21 
nucleotides in length. 

The double strand portion of a double stranded iRNA agent should be equal to or at 
least 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 29, 40, or 60 nucleotide pairs in length. It 
should be equal to or less than 200, 100, or 50 nucleotide pairs in length. Preferred ranges 
are 15-30, 17 to 23, 19 to 23, and 19 to 21 nucleotide pairs in length. 

In many embodiments, the ds iRNA agent is sufficiently large that it can be cleaved 
by an endogenous molecule, e.g., by Dicer, to produce smaller ds iRNA agents, e.g., sRNA 
agents. 

It may be desirable to modify one or both of the antisense and sense strands of a 
double strand iRNA agent. In some cases they will have the same modification or the same 
class of modification but in other cases the sense and antisense strand will have different 
modifications, e.g., in some cases it is desirable to modify only the sense strand. It may be 
desirable to modify only the sense strand, e.g., to inactivate it, e.g., the sense strand can be 
modified in order to inactivate the sense strand and prevent formation of an active 
sRNA/protein or RISC. This can be accomplished by a modification which prevents 5'- 
phosphorylation of the sense strand, e.g., by modification with a 5'-O-methyl ribonucleotide 
(see Nykanen et al., (2001) ATP requirements and small interfering RNA structure in the 
RNA interference pathway. Cell 107, 309-321.) Other modifications which prevent 
phosphorylation can also be used, e.g., simply substituting the 5'-OH by H rather than O-Me. 
Alternatively, a large bulky group may be added to the 5'-phosphate turning it into a 
phosphodiester linkage, though this may be less desirable as phosphodiesterases can cleave 
such a linkage and release a functional sRNA 5'-end. Antisense strand modifications include 
5' phosphorylation as well as any of the other 5' modifications discussed herein, particularly 
the 5' modifications discussed above in the section on single stranded iRNA molecules. 

It is preferred that the sense and antisense strands be chosen such that the ds iRNA 
agent includes a single strand or unpaired region at one or both ends of the molecule. Thus, a 
ds iRNA agent contains sense and antisense strands, preferably paired to contain an 
overhang, e.g., one or two 5' or 3' overhangs but preferably a 3' overhang of 2-3 
nucleotides. Most embodiments will have a 3' overhang. Preferred sRNA agents will have 
single-stranded overhangs, preferably 3' overhangs, of 1 or preferably 2 or 3 nucleotides in 
length at each end. The overhangs can be the result of one strand being longer than the other, 
or the result of two strands of the same length being staggered. 5' ends are preferably 
phosphorylated. 
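
The relationship between strand length, overhang length, and duplex length described above can be made concrete with a short Python sketch. It assumes a two-strand agent with 3' overhangs only (no 5' overhangs); the function name and parameters are hypothetical and serve only as an illustration.

```python
# Illustrative sketch only: assumes a two-strand agent whose unpaired
# nucleotides are all in 3' overhangs; names are hypothetical.

def duplex_length(sense_len, antisense_len, sense_3p_overhang, antisense_3p_overhang):
    """Return the number of nucleotide pairs in the duplexed region.

    Each strand is modeled as (paired region) + (its own 3' overhang), so the
    paired length computed from either strand must agree.
    """
    paired_from_sense = sense_len - sense_3p_overhang
    paired_from_antisense = antisense_len - antisense_3p_overhang
    if paired_from_sense != paired_from_antisense or paired_from_sense <= 0:
        raise ValueError("strand lengths and overhangs are inconsistent")
    return paired_from_sense


# Two 21-nt strands staggered to leave a 2-nt 3' overhang at each end.
staggered = duplex_length(21, 21, 2, 2)

# A 21-nt antisense strand paired with a 19-nt sense strand: the overhang
# results from one strand being longer than the other.
unequal = duplex_length(19, 21, 0, 2)
```

Both arrangements give a 19 nucleotide pair duplex under these assumptions, which corresponds to the two ways of producing overhangs noted in the text.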

Preferred lengths for the duplexed region are between 15 and 30, most preferably 18, 
19, 20, 21, 22, and 23 nucleotides in length, e.g., in the sRNA agent range discussed above. 
sRNA agents can resemble in length and structure the natural Dicer processed products from 
long dsRNAs. Embodiments in which the two strands of the sRNA agent are linked, e.g., 
covalently linked are also included. Hairpin, or other single strand structures which provide 
the required double stranded region, and preferably a 3' overhang are also within the 
invention. 

The isolated iRNA agents described herein, including ds iRNA agents and sRNA 
agents can mediate silencing of a target RNA, e.g., mRNA, e.g., a transcript of a gene that 
encodes a protein. For convenience, such mRNA is also referred to herein as mRNA to be 
silenced. Such a gene is also referred to as a target gene. In general, the RNA to be silenced 
is an endogenous gene or a pathogen gene. In addition, RNAs other than mRNA, e.g., 
tRNAs, and viral RNAs, can also be targeted. 

As used herein, the phrase "mediates RNAi" refers to the ability to silence, in a 
sequence specific manner, a target RNA. While not wishing to be bound by theory, it is 
believed that silencing uses the RNAi machinery or process and a guide RNA, e.g., an sRNA 
agent of 21 to 23 nucleotides. 

As used herein, "specifically hybridizable" and "complementary" are terms which are 
used to indicate a sufficient degree of complementarity such that stable and specific binding 
occurs between a compound of the invention and a target RNA molecule. Specific binding 
requires a sufficient degree of complementarity to avoid non-specific binding of the 
oligomeric compound to non-target sequences under conditions in which specific binding is 
desired, i.e., under physiological conditions in the case of in vivo assays or therapeutic 
treatment, or in the case of in vitro assays, under conditions in which the assays are 
performed. The non-target sequences typically differ by at least 5 nucleotides. 

In one embodiment, an iRNA agent is "sufficiently complementary" to a target RNA, 
e.g., a target mRNA, such that the iRNA agent silences production of protein encoded by the 
target mRNA. In another embodiment, the iRNA agent is "exactly complementary" 
(excluding the RRMS containing subunit(s)) to a target RNA, e.g., the target RNA and the 
iRNA agent anneal, preferably to form a hybrid made exclusively of Watson-Crick basepairs 
in the region of exact complementarity. A "sufficiently complementary" target RNA can 
include an internal region (e.g., of at least 10 nucleotides) that is exactly complementary to a 
target RNA. Moreover, in some embodiments, the iRNA agent specifically discriminates a 
single-nucleotide difference. In this case, the iRNA agent only mediates RNAi if exact 
complementarity is found in the region (e.g., within 7 nucleotides of) the single-nucleotide 
difference. 

As used herein, the term "oligonucleotide" refers to a nucleic acid molecule (RNA or 
DNA) preferably of length less than 100, 200, 300, or 400 nucleotides. 

RNA agents discussed herein include otherwise unmodified RNA as well as RNA 
which have been modified, e.g., to improve efficacy, and polymers of nucleoside surrogates. 
Unmodified RNA refers to a molecule in which the components of the nucleic acid, namely 
sugars, bases, and phosphate moieties, are the same or essentially the same as that which 
occur in nature, preferably as occur naturally in the human body. The art has referred to rare 
or unusual, but naturally occurring, RNAs as modified RNAs, see, e.g., Limbach et al., 
(1994) Summary: the modified nucleosides of RNA, Nucleic Acids Res. 22: 2183-2196. 
Such rare or unusual RNAs, often termed modified RNAs (apparently because they are 
typically the result of a post-transcriptional modification) are within the term unmodified 
RNA, as used herein. Modified RNA as used herein refers to a molecule in which one or 
more of the components of the nucleic acid, namely sugars, bases, and phosphate moieties, 
are different from that which occur in nature, preferably different from that which occurs in 
the human body. While they are referred to as modified "RNAs," they will of course, 
because of the modification, include molecules which are not RNAs. Nucleoside surrogates 
are molecules in which the ribophosphate backbone is replaced with a non-ribophosphate 
construct that allows the bases to be presented in the correct spatial relationship such that 
hybridization is substantially similar to what is seen with a ribophosphate backbone, e.g., 
non-charged mimics of the ribophosphate backbone. Examples of all of the above are 
discussed herein. 

Much of the discussion below refers to single strand molecules. In many 
embodiments of the invention a double stranded iRNA agent, e.g., a partially double stranded 
iRNA agent, is required or preferred. Thus, it is understood that double stranded 
structures (e.g. where two separate molecules are contacted to form the double stranded 
region or where the double stranded region is formed by intramolecular pairing (e.g., a 
hairpin structure)) made of the single stranded structures described below are within the 
invention. Preferred lengths are described elsewhere herein. 

As nucleic acids are polymers of subunits or monomers, many of the modifications 
described below occur at a position which is repeated within a nucleic acid, e.g., a 
modification of a base, or a phosphate moiety, or a non-linking O of a phosphate moiety. 
In some cases the modification will occur at all of the subject positions in the nucleic acid but 
in many, and in fact in most, cases it will not. By way of example, a modification may only 
occur at a 3' or 5' terminal position, may only occur in a terminal region, e.g. at a position 
on a terminal nucleotide or in the last 2, 3, 4, 5, or 10 nucleotides of a strand. A modification 
may occur in a double strand region, a single strand region, or in both. A modification may 
occur only in the double strand region of an RNA or may only occur in a single strand region 
of an RNA. E.g., a phosphorothioate modification at a non-linking O position may only 
occur at one or both termini, may only occur in a terminal region, e.g., at a position on a 
terminal nucleotide or in the last 2, 3, 4, 5, or 10 nucleotides of a strand, or may occur in 
double strand and single strand regions, particularly at termini. The 5' end or ends can be 
phosphorylated. 

In some embodiments it is particularly preferred, e.g., to enhance stability, to include 
particular bases in overhangs, or to include modified nucleotides or nucleotide surrogates, in 
single strand overhangs, e.g., in a 5' or 3' overhang, or in both. E.g., it can be desirable to 
include purine nucleotides in overhangs. In some embodiments all or some of the bases in a 
3' or 5' overhang will be modified, e.g., with a modification described herein. Modifications 
can include, e.g., the use of modifications at the 2' OH group of the ribose sugar, e.g., the use 
of deoxyribonucleotides, e.g., deoxythymidine, instead of ribonucleotides, and modifications 
in the phosphate group, e.g., phosphorothioate modifications. Overhangs need not be 
homologous with the target sequence. 

Modifications and nucleotide surrogates are discussed below. 




FORMULA 1 



The scaffold presented above in Formula 1 represents a portion of a ribonucleic acid. 
The basic components are the ribose sugar, the base, the terminal phosphates, and phosphate 
internucleotide linkers. Where the bases are naturally occurring bases, e.g., adenine, uracil, 
guanine or cytosine, the sugars are the unmodified 2' hydroxyl ribose sugar (as depicted) and 
W, X, Y, and Z are all O, Formula 1 represents a naturally occurring unmodified 
oligoribonucleotide. 

Unmodified oligoribonucleotides may be less than optimal in some applications, e.g., 
unmodified oligoribonucleotides can be prone to degradation by e.g., cellular nucleases. 
Nucleases can hydrolyze nucleic acid phosphodiester bonds. However, chemical 
modifications to one or more of the above RNA components can confer improved properties, 
and, e.g., can render oligoribonucleotides more stable to nucleases. Unmodified 
oligoribonucleotides may also be less than optimal in terms of offering tethering points for 
attaching ligands or other moieties to an iRNA agent. 

Modified nucleic acids and nucleotide surrogates can include one or more of: 

(i) alteration, e.g., replacement, of one or both of the non-linking (X and Y) 
phosphate oxygens and/or of one or more of the linking (W and Z) phosphate oxygens 
(When the phosphate is in the terminal position, one of the positions W or Z will not link the 
phosphate to an additional element in a naturally occurring ribonucleic acid. However, for 
simplicity of terminology, except where otherwise noted, the W position at the 5' end of a 
nucleic acid and the terminal Z position at the 3' end of a nucleic acid, are within the term 
"linking phosphate oxygens" as used herein.); 

(ii) alteration, e.g., replacement, of a constituent of the ribose sugar, e.g., of the 2' 
hydroxyl on the ribose sugar, or wholesale replacement of the ribose sugar with a structure 
other than ribose, e.g., as described herein; 

(iii) wholesale replacement of the phosphate moiety (bracket I) with "dephospho" 
linkers; 

(iv) modification or replacement of a naturally occurring base; 

(v) replacement or modification of the ribose-phosphate backbone (bracket II); 

(vi) modification of the 3' end or 5' end of the RNA, e.g., removal, modification or 
replacement of a terminal phosphate group or conjugation of a moiety, e.g. a fluorescently 
labeled moiety, to either the 3' or 5' end of RNA. 

The terms replacement, modification, alteration, and the like, as used in this context, 
do not imply any process limitation, e.g., modification does not mean that one must start with 
a reference or naturally occurring ribonucleic acid and modify it to produce a modified 
ribonucleic acid but rather modified simply indicates a difference from a naturally occurring 
molecule. 

It is understood that the actual electronic structure of some chemical entities cannot 
be adequately represented by only one canonical form (i.e. Lewis structure). While not 
wishing to be bound by theory, the actual structure can instead be some hybrid or weighted 
average of two or more canonical forms, known collectively as resonance forms or 
structures. Resonance structures are not discrete chemical entities and exist only on paper. 
They differ from one another only in the placement or "localization" of the bonding and 
nonbonding electrons for a particular chemical entity. It can be possible for one resonance 
structure to contribute to a greater extent to the hybrid than the others. Thus, the written and 
graphical descriptions of the embodiments of the present invention are made in terms of what 
the art recognizes as the predominant resonance form for a particular species. For example, 
any phosphoroamidate (replacement of a nonlinking oxygen with nitrogen) would be 
represented by X = O and Y = N in the above figure. 

Specific modifications are discussed in more detail below. 

The Phosphate Group 

The phosphate group is a negatively charged species. The charge is distributed 
equally over the two non-linking oxygen atoms (i.e., X and Y in Formula 1 above). However, 
the phosphate group can be modified by replacing one of the oxygens with a different 
substituent. One result of this modification to RNA phosphate backbones can be increased 
resistance of the oligoribonucleotide to nucleolytic breakdown. Thus while not wishing to be 
bound by theory, it can be desirable in some embodiments to introduce alterations which 
result in either an uncharged linker or a charged linker with asymmetrical charge 
distribution. 

Examples of modified phosphate groups include phosphorothioate, 
phosphoroselenates, borano phosphates, borano phosphate esters, hydrogen phosphonates, 
phosphoroamidates, alkyl or aryl phosphonates and phosphotriesters. Phosphorodithioates 
have both non-linking oxygens replaced by sulfur. Unlike the situation where only one of X 
or Y is altered, the phosphorus center in the phosphorodithioates is achiral which precludes 
the formation of oligoribonucleotide diastereomers. Diastereomer formation can result in a 
preparation in which the individual diastereomers exhibit varying resistance to nucleases. 
Further, the hybridization affinity of RNA containing chiral phosphate groups can be lower 
relative to the corresponding unmodified RNA species. Thus, while not wishing to be bound 
by theory, modifications to both X and Y which eliminate the chiral center, e.g. 
phosphorodithioate formation, may be desirable in that they cannot produce diastereomer 
mixtures. Thus, X can be any one of S, Se, B, C, H, N, or OR (R is alkyl or aryl). Thus Y 
can be any one of S, Se, B, C, H, N, or OR (R is alkyl or aryl). Replacement of X and/or Y 
with sulfur is preferred. 

The phosphate linker can also be modified by replacement of a linking oxygen (i.e., 
W or Z in Formula 1) with nitrogen (bridged phosphoroamidates), sulfur (bridged 
phosphorothioates) and carbon (bridged methylenephosphonates). The replacement can 
occur at a terminal oxygen (position W (3') or position Z (5')). Replacement of W with 
carbon or Z with nitrogen is preferred. 

Candidate agents can be evaluated for suitability as described below. 

The Sugar Group 

A modified RNA can include modification of all or some of the sugar groups of the 
ribonucleic acid. E.g., the 2' hydroxyl group (OH) can be modified or replaced with a 
number of different "oxy" or "deoxy" substituents. While not being bound by theory, 
enhanced stability is expected since the hydroxyl can no longer be deprotonated to form a 2' 
alkoxide ion. The 2' alkoxide can catalyze degradation by intramolecular nucleophilic attack 
on the linker phosphorus atom. Again, while not wishing to be bound by theory, it can be 
desirable in some embodiments to introduce alterations in which alkoxide formation at the 2' 
position is not possible. 

Examples of "oxy"-2' hydroxyl group modifications include alkoxy or aryloxy (OR, 
e.g., R = H, alkyl, cycloalkyl, aryl, aralkyl, heteroaryl or sugar); polyethyleneglycols (PEG), 
O(CH2CH2O)nCH2CH2OR; "locked" nucleic acids (LNA) in which the 2' hydroxyl is 
connected, e.g., by a methylene bridge, to the 4' carbon of the same ribose sugar; O-AMINE 
(AMINE = NH2; alkylamino, dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl 
amino, or diheteroaryl amino, ethylene diamine, polyamino) and aminoalkoxy, 
O(CH2)nAMINE, (e.g., AMINE = NH2; alkylamino, dialkylamino, heterocyclyl, arylamino, 
diaryl amino, heteroaryl amino, or diheteroaryl amino, ethylene diamine, polyamino). It is 
noteworthy that oligonucleotides containing only the methoxyethyl group (MOE), 
(OCH2CH2OCH3, a PEG derivative), exhibit nuclease stabilities comparable to those 
modified with the robust phosphorothioate modification. 

"Deoxy" modifications include hydrogen (i.e. deoxyribose sugars, which are of 
particular relevance to the overhang portions of partially ds RNA); halo (e.g., fluoro); amino
(e.g. NH2; alkylamino, dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl
amino, diheteroaryl amino, or amino acid); NH(CH2CH2NH)nCH2CH2-AMINE (AMINE =
NH2; alkylamino, dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino, or
diheteroaryl amino), -NHC(O)R (R = alkyl, cycloalkyl, aryl, aralkyl, heteroaryl or sugar),
cyano; mercapto; alkyl-thio-alkyl; thioalkoxy; and alkyl, cycloalkyl, aryl, alkenyl and
alkynyl, which may be optionally substituted with e.g., an amino functionality. Preferred
substituents are 2'-methoxyethyl, 2'-OCH3, 2'-O-allyl, 2'-C-allyl, and 2'-fluoro.

The sugar group can also contain one or more carbons that possess the opposite 
stereochemical configuration than that of the corresponding carbon in ribose. Thus, a 
modified RNA can include nucleotides containing e.g., arabinose, as the sugar. 

Modified RNA's can also include "abasic" sugars, which lack a nucleobase at C-1'.
These abasic sugars can also further contain modifications at one or more of the
constituent sugar atoms.

To maximize nuclease resistance, the 2' modifications can be used in combination 
with one or more phosphate linker modifications (e.g., phosphorothioate). The so-called 
"chimeric" oligonucleotides are those that contain two or more different modifications. 

The modification can also entail the wholesale replacement of a ribose structure with
another entity at one or more sites in the iRNA agent. These modifications are described in
the section entitled Ribose Replacements for RRMSs.

Candidate modifications can be evaluated as described below. 

Replacement of the Phosphate Group 

The phosphate group can be replaced by non-phosphorus containing connectors (cf. 
Bracket I in Formula 1 above). While not wishing to be bound by theory, it is believed that 
since the charged phosphodiester group is the reaction center in nucleolytic degradation, its 
replacement with neutral structural mimics should impart enhanced nuclease stability.
Again, while not wishing to be bound by theory, it can be desirable, in some embodiments, to
introduce alterations in which the charged phosphate group is replaced by a neutral moiety.

Examples of moieties which can replace the phosphate group include siloxane, 
carbonate, carboxymethyl, carbamate, amide, thioether, ethylene oxide linker, sulfonate, 
sulfonamide, thioformacetal, formacetal, oxime, methyleneimino, methylenemethylimino,
methylenehydrazo, methylenedimethylhydrazo and methyleneoxymethylimino. Preferred
replacements include the methylenecarbonylamino and methylenemethylimino groups. 
Candidate modifications can be evaluated as described below. 

Replacement of Ribophosphate Backbone

Oligonucleotide-mimicking scaffolds can also be constructed wherein the phosphate
linker and ribose sugar are replaced by nuclease resistant nucleoside or nucleotide surrogates 
(see Bracket II of Formula 1 above). While not wishing to be bound by theory, it is believed 
that the absence of a repetitively charged backbone diminishes binding to proteins that 
recognize polyanions (e.g. nucleases). Again, while not wishing to be bound by theory, it 
can be desirable, in some embodiments, to introduce alterations in which the bases are tethered
by a neutral surrogate backbone. 

Examples include the morpholino, cyclobutyl, pyrrolidine and peptide nucleic acid
(PNA) nucleoside surrogates. A preferred surrogate is a PNA surrogate. 

Candidate modifications can be evaluated as described below. 

Terminal Modifications 

The 3' and 5' ends of an oligonucleotide can be modified. Such modifications can be 
at the 3' end, 5' end or both ends of the molecule. They can include modification or 
replacement of an entire terminal phosphate or of one or more of the atoms of the phosphate 
group. E.g., the 3' and 5' ends of an oligonucleotide can be conjugated to other functional 
molecular entities such as labeling moieties, e.g., fluorophores (e.g., pyrene, TAMRA, 
fluorescein, Cy3 or Cy5 dyes) or protecting groups (based e.g., on sulfur, silicon, boron or 
ester). The functional molecular entities can be attached to the sugar through a phosphate 
group and/or a spacer. The terminal atom of the spacer can connect to or replace the linking 
atom of the phosphate group or the C-3' or C-5' O, N, S or C group of the sugar. 
Alternatively, the spacer can connect to or replace the terminal atom of a nucleotide 
surrogate (e.g., PNAs). These spacers or linkers can include e.g., -(CH2)n-, -(CH2)nN-,
-(CH2)nO-, -(CH2)nS-, O(CH2CH2O)nCH2CH2OH (e.g., n = 3 or 6), abasic sugars, amide,
carboxy, amine, oxyamine, oxyimine, thioether, disulfide, thiourea, sulfonamide, or 
morpholino, or biotin and fluorescein reagents. When a spacer/phosphate-functional 
molecular entity-spacer/phosphate array is interposed between two strands of iRNA agents, 
this array can substitute for a hairpin RNA loop in a hairpin-type RNA agent. The 3' end can
be an -OH group. While not wishing to be bound by theory, it is believed that conjugation of 
certain moieties can improve transport, hybridization, and specificity properties. Again, 
while not wishing to be bound by theory, it may be desirable to introduce terminal alterations 
that improve nuclease resistance. Other examples of terminal modifications include dyes, 
intercalating agents (e.g. acridines), cross-linkers (e.g. psoralene, mitomycin C), porphyrins 
(TPPC4, texaphyrin, Sapphyrin), polycyclic aromatic hydrocarbons (e.g., phenazine, 
dihydrophenazine), artificial endonucleases (e.g. EDTA), lipophilic carriers (e.g., cholesterol, 
cholic acid, adamantane acetic acid, 1-pyrene butyric acid, dihydrotestosterone,
1,3-Bis-O(hexadecyl)glycerol, geranyloxyhexyl group, hexadecylglycerol, borneol, menthol,
1,3-propanediol, heptadecyl group, palmitic acid, myristic acid, O3-(oleoyl)lithocholic acid,
O3-(oleoyl)cholenic acid, dimethoxytrityl, or phenoxazine) and peptide conjugates (e.g.,
antennapedia peptide, Tat peptide), alkylating agents, phosphate, amino, mercapto, PEG 
(e.g., PEG-40K), MPEG, [MPEG]2, polyamino, alkyl, substituted alkyl, radiolabeled
markers, enzymes, haptens (e.g. biotin), transport/absorption facilitators (e.g., aspirin, 
vitamin E, folic acid), synthetic ribonucleases (e.g., imidazole, bisimidazole, histamine, 
imidazole clusters, acridine-imidazole conjugates, Eu3+ complexes of tetraazamacrocycles). 

Terminal modifications can be added for a number of reasons, including as discussed 
elsewhere herein to modulate activity or to modulate resistance to degradation. Terminal 
modifications useful for modulating activity include modification of the 5' end with 
phosphate or phosphate analogs. E.g., in preferred embodiments iRNA agents, especially
antisense strands, are 5' phosphorylated or include a phosphoryl analog at the 5'
terminus. 5'-phosphate modifications include those which are compatible with RISC
mediated gene silencing. Suitable modifications include: 5'-monophosphate
((HO)2(O)P-O-5'); 5'-diphosphate ((HO)2(O)P-O-P(HO)(O)-O-5'); 5'-triphosphate
((HO)2(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'); 5'-guanosine cap (7-methylated or non-methylated)
(7m-G-O-5'-(HO)(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'); 5'-adenosine cap (Appp), and any modified or
unmodified nucleotide cap structure (N-O-5'-(HO)(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5');
5'-monothiophosphate (phosphorothioate; (HO)2(S)P-O-5'); 5'-monodithiophosphate
(phosphorodithioate; (HO)(HS)(S)P-O-5'), 5'-phosphorothiolate ((HO)2(O)P-S-5'); any
additional combination of oxygen/sulfur replaced monophosphate, diphosphate and
triphosphates (e.g. 5'-alpha-thiotriphosphate, 5'-gamma-thiotriphosphate, etc.),
5'-phosphoramidates ((HO)2(O)P-NH-5', (HO)(NH2)(O)P-O-5'), 5'-alkylphosphonates
(R=alkyl=methyl, ethyl, isopropyl, propyl, etc., e.g. RP(OH)(O)-O-5'-, (OH)2(O)P-5'-CH2-),
5'-alkyletherphosphonates (R=alkylether=methoxymethyl (MeOCH2-), ethoxymethyl, etc.,
e.g. RP(OH)(O)-O-5'-).

Terminal modifications useful for increasing resistance to degradation include 
Terminal modifications can also be useful for monitoring distribution, and in such 
cases the preferred groups to be added include fluorophores, e.g., fluorescein or an Alexa dye,
e.g., Alexa 488. Terminal modifications can also be useful for enhancing uptake; useful
modifications for this include cholesterol. Terminal modifications can also be useful for 
cross-linking an RNA agent to another moiety; modifications useful for this include 
mitomycin C. 

Candidate modifications can be evaluated as described below.

The Bases

Adenine, guanine, cytosine and uracil are the most common bases found in RNA. 
These bases can be modified or replaced to provide RNA's having improved properties. 
E.g., nuclease resistant oligoribonucleotides can be prepared with these bases or with 
synthetic and natural nucleobases (e.g., inosine, thymine, xanthine, hypoxanthine,
nubularine, isoguanisine, or tubercidine) and any one of the above modifications. 
Alternatively, substituted or modified analogs of any of the above bases, e.g., "unusual 
bases" and "universal bases," can be employed. Examples include without limitation 2- 
aminoadenine, 6-methyl and other alkyl derivatives of adenine and guanine, 2-propyl and 
other alkyl derivatives of adenine and guanine, 5-halouracil and cytosine, 5-propynyl uracil 
and cytosine, 6-azo uracil, cytosine and thymine, 5-uracil (pseudouracil), 4-thiouracil, 5- 
halouracil, 5-(2-aminopropyl)uracil, 5-amino allyl uracil, 8-halo, amino, thiol, thioalkyl,
hydroxyl and other 8-substituted adenines and guanines, 5-trifluoromethyl and other
5-substituted uracils and cytosines, 7-methylguanine, 5-substituted pyrimidines,
6-azapyrimidines and N-2, N-6 and O-6 substituted purines, including 2-aminopropyladenine,
5-propynyluracil and 5-propynylcytosine, dihydrouracil, 3-deaza-5-azacytosine,
2-aminopurine, 5-alkyluracil, 7-alkylguanine, 5-alkyl cytosine, 7-deazaadenine,
N6, N6-dimethyladenine, 2,6-diaminopurine, 5-amino-allyl-uracil, N3-methyluracil, substituted
1,2,4-triazoles, 2-pyridinone, 5-nitroindole, 3-nitropyrrole, 5-methoxyuracil,
uracil-5-oxyacetic acid, 5-methoxycarbonylmethyluracil, 5-methyl-2-thiouracil,
5-methoxycarbonylmethyl-2-thiouracil, 5-methylaminomethyl-2-thiouracil,
3-(3-amino-3-carboxypropyl)uracil, 3-methylcytosine, 5-methylcytosine, N4-acetyl cytosine,
2-thiocytosine, N6-methyladenine, N6-isopentyladenine, 2-methylthio-N6-isopentenyladenine,
N-methylguanines, or O-alkylated bases. Further purines and pyrimidines include those
disclosed in U.S. Pat. No. 3,687,808, those disclosed in the Concise Encyclopedia Of
Polymer Science And Engineering, pages 858-859, Kroschwitz, J. I., ed. John Wiley & Sons,
1990, and those disclosed by Englisch et al., Angewandte Chemie, International Edition,
1991, 30, 613.

Generally, base changes are less preferred for promoting stability, but they can be 
useful for other reasons; some, e.g., 2,6-diaminopurine and 2-aminopurine, are
fluorescent. Modified bases can reduce target specificity. This should be taken into 
consideration in the design of iRNA agents. 

Candidate modifications can be evaluated as described below. 

Evaluation of Candidate RNA's 

One can evaluate a candidate RNA agent, e.g., a modified RNA, for a selected
property by exposing the agent or modified molecule and a control molecule to the 
appropriate conditions and evaluating for the presence of the selected property. For example, 
resistance to a degradant can be evaluated as follows. A candidate modified RNA (and
preferably a control molecule, usually the unmodified form) can be exposed to degradative 
conditions, e.g., exposed to a milieu, which includes a degradative agent, e.g., a nuclease. 
E.g., one can use a biological sample, e.g., one that is similar to a milieu, which might be 
encountered, in therapeutic use, e.g., blood or a cellular fraction, e.g., a cell-free homogenate 
or disrupted cells. The candidate and control could then be evaluated for resistance to 
degradation by any of a number of approaches. For example, the candidate and control could 
be labeled, preferably prior to exposure, with, e.g., a radioactive or enzymatic label, or a 
fluorescent label, such as Cy3 or Cy5. Control and modified RNA's can be incubated with 
the degradative agent, and optionally a control, e.g., an inactivated, e.g., heat inactivated, 
degradative agent. A physical parameter, e.g., size, of the modified and control molecules
is then determined. It can be determined by a physical method, e.g., by polyacrylamide
gel electrophoresis or a sizing column, to assess whether the molecule has maintained its 
original length, or assessed functionally. Alternatively, Northern blot analysis can be used to 
assay the length of an unlabeled modified molecule. 
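
By way of illustration only, once the amount of full-length molecule has been read from a gel
(e.g., by densitometry), the comparison between a candidate and its unmodified control reduces
to a simple ratio. The sketch below assumes hypothetical band intensities in arbitrary units;
the function names and the numbers are illustrative assumptions and are not taken from the
disclosure.

```python
def fraction_intact(full_length_before: float, full_length_after: float) -> float:
    """Fraction of full-length molecule remaining after exposure to the degradative agent."""
    if full_length_before <= 0:
        raise ValueError("starting intensity must be positive")
    return full_length_after / full_length_before


def relative_stability(candidate_fraction: float, control_fraction: float) -> float:
    """Ratio of candidate to control fraction intact; values above 1 suggest more resistance."""
    if control_fraction <= 0:
        # The control was fully degraded, so the ratio is unbounded.
        return float("inf")
    return candidate_fraction / control_fraction


# Hypothetical densitometry readings (arbitrary units), not measured data.
modified = fraction_intact(1000.0, 800.0)
unmodified = fraction_intact(1000.0, 200.0)
print(relative_stability(modified, unmodified))
```

A heat-inactivated degradative agent, as mentioned above, would be expected to leave both
fractions close to 1 and so serves as a check on the readout itself.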

A functional assay can also be used to evaluate the candidate agent. A functional 
assay can be applied initially or after an earlier non-functional assay (e.g., an assay for
resistance to degradation) to determine if the modification alters the ability of the molecule to 
silence gene expression. For example, a cell, e.g., a mammalian cell, such as a mouse or 
human cell, can be co-transfected with a plasmid expressing a fluorescent protein, e.g., GFP, 
and a candidate RNA agent homologous to the transcript encoding the fluorescent protein 
(see, e.g., WO 00/44914). For example, a modified dsRNA homologous to the GFP mRNA 
can be assayed for the ability to inhibit GFP expression by monitoring for a decrease in cell 
fluorescence, as compared to a control cell, in which the transfection did not include the 
candidate dsRNA, e.g., controls with no agent added and/or controls with a non-modified 
RNA added. Efficacy of the candidate agent on gene expression can be assessed by 
comparing cell fluorescence in the presence of the modified and unmodified dsRNA agents. 
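
As a non-limiting sketch, the fluorescence comparison described above can be expressed as a
percent decrease relative to the control cell. The function name, the optional background
correction, and the example readings below are assumptions introduced for illustration; they
are not part of the disclosure.

```python
def percent_knockdown(treated: float, control: float, background: float = 0.0) -> float:
    """Percent decrease in fluorescence of a treated cell relative to a control cell."""
    control_signal = control - background
    if control_signal <= 0:
        raise ValueError("control fluorescence must exceed background")
    # Clamp at zero so a reading below background is not reported as >100% knockdown.
    treated_signal = max(treated - background, 0.0)
    return 100.0 * (1.0 - treated_signal / control_signal)


# Hypothetical fluorescence readings (arbitrary units), not measured data.
unmodified_dsrna = percent_knockdown(treated=250.0, control=1000.0)
modified_dsrna = percent_knockdown(treated=300.0, control=1000.0)
print(unmodified_dsrna, modified_dsrna)
```

Comparing the two values indicates whether the modification has altered the ability of the
agent to silence expression of the reporter.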

In an alternative functional assay, a candidate dsRNA agent homologous to an 
endogenous mouse gene, preferably a maternally expressed gene, such as c-mos, can be 
injected into an immature mouse oocyte to assess the ability of the agent to inhibit gene 
expression in vivo (see, e.g., WO 01/36646). A phenotype of the oocyte, e.g., the ability to 
maintain arrest in metaphase II, can be monitored as an indicator that the agent is inhibiting 
expression. For example, cleavage of c-mos mRNA by a dsRNA agent would cause the 
oocyte to exit metaphase arrest and initiate parthenogenetic development (Colledge et al. 
Nature 370: 65-68, 1994; Hashimoto et al. Nature, 370:68-71, 1994). The effect of the
modified agent on target RNA levels can be verified by Northern blot to assay for a decrease 
in the level of target mRNA, or by Western blot to assay for a decrease in the level of target 
protein, as compared to a negative control. Controls can include cells to which no agent
is added and/or cells to which a non-modified RNA is added.
as is described in U.S. Pat. No. 5,223,618. Siloxane replacements are described in
Cormier, J.F. et al. Nucleic Acids Res. 1988, 16, 4583. Carbonate replacements are described
in Tittensor, J.R. J. Chem. Soc. C 1971, 1933. Carboxymethyl replacements are described in 
Edge, M.D. et al. J. Chem. Soc. Perkin Trans. 1 1972, 1991. Carbamate replacements are 
described in Stirchak, E.P. Nucleic Acids Res. 1989, 17, 6129. 

Replacement of the Phosphate-Ribose Backbone References
Cyclobutyl sugar surrogate compounds can be prepared as is described in U.S. Pat. 
No. 5,359,044. Pyrrolidine sugar surrogate can be prepared as is described in U.S. Pat. No.
5,519,134. Morpholino sugar surrogates can be prepared as is described in U.S. Pat. Nos.
5,142,047 and 5,235,033, and other related patent disclosures. Peptide Nucleic Acids (PNAs) 
are known per se and can be prepared in accordance with any of the various procedures 
referred to in Peptide Nucleic Acids (PNA): Synthesis, Properties and Potential Applications, 
Bioorganic & Medicinal Chemistry, 1996, 4, 5-23. They may also be prepared in accordance 
with U.S. Pat. No. 5,539,083. 

Terminal Modification References 

Terminal modifications are described in Manoharan, M. et al. Antisense and Nucleic 
Acid Drug Development 12, 103-128 (2002) and references therein. 



Bases References 

N-2 substituted purine nucleoside amidites can be prepared as is described in U.S. Pat.
No. 5,459,255. 3-Deaza purine nucleoside amidites can be prepared as is described in U.S. 
Pat. No. 5,457,191. 5,6-Substituted pyrimidine nucleoside amidites can be prepared as is 
described in U.S. Pat. No. 5,614,617. 5-Propynyl pyrimidine nucleoside amidites can be 
prepared as is described in U.S. Pat. No. 5,484,908. Additional references are disclosed
in the above section on base modifications.
Preferred iRNA Agents

Preferred RNA agents have the following structure (see Formula 2 below): 




FORMULA 2 



Referring to Formula 2 above, R1, R2, and R3 are each, independently, H (i.e., abasic
nucleotides), adenine, guanine, cytosine and uracil, inosine, thymine, xanthine,
hypoxanthine, nubularine, tubercidine, isoguanisine, 2-aminoadenine, 6-methyl and other
alkyl derivatives of adenine and guanine, 2-propyl and other alkyl derivatives of adenine and
guanine, 5-halouracil and cytosine, 5-propynyl uracil and cytosine, 6-azo uracil, cytosine and
thymine, 5-uracil (pseudouracil), 4-thiouracil, 5-halouracil, 5-(2-aminopropyl)uracil, 5-amino
allyl uracil, 8-halo, amino, thiol, thioalkyl, hydroxyl and other 8-substituted adenines and
guanines, 5-trifluoromethyl and other 5-substituted uracils and cytosines, 7-methylguanine,
5-substituted pyrimidines, 6-azapyrimidines and N-2, N-6 and O-6 substituted purines,
including 2-aminopropyladenine, 5-propynyluracil and 5-propynylcytosine, dihydrouracil,
3-deaza-5-azacytosine, 2-aminopurine, 5-alkyluracil, 7-alkylguanine, 5-alkyl cytosine,
7-deazaadenine, 7-deazaguanine, N6, N6-dimethyladenine, 2,6-diaminopurine,
5-amino-allyl-uracil, N3-methyluracil, substituted 1,2,4-triazoles, 2-pyridinone, 5-nitroindole,
3-nitropyrrole, 5-methoxyuracil, uracil-5-oxyacetic acid, 5-methoxycarbonylmethyluracil,
5-methyl-2-thiouracil, 5-methoxycarbonylmethyl-2-thiouracil, 5-methylaminomethyl-2-thiouracil,
3-(3-amino-3-carboxypropyl)uracil, 3-methylcytosine, 5-methylcytosine, N4-acetyl
cytosine, 2-thiocytosine, N6-methyladenine, N6-isopentyladenine,
2-methylthio-N6-isopentenyladenine, N-methylguanines, or O-alkylated bases.

R4, R5, and R6 are each, independently, OR8, O(CH2CH2O)mCH2CH2OR8;
O(CH2)nR9; O(CH2)nOR9, H; halo; NH2; NHR8; N(R8)2; NH(CH2CH2NH)mCH2CH2NHR9;
NHC(O)R8; cyano; mercapto, SR8; alkyl-thio-alkyl; alkyl, aralkyl, cycloalkyl, aryl,
heteroaryl, alkenyl, alkynyl, each of which may be optionally substituted with halo, hydroxy,
oxo, nitro, haloalkyl, alkyl, alkaryl, aryl, aralkyl, alkoxy, aryloxy, amino, alkylamino,
dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino, diheteroaryl amino,
acylamino, alkylcarbamoyl, arylcarbamoyl, aminoalkyl, alkoxycarbonyl, carboxy,
hydroxyalkyl, alkanesulfonyl, alkanesulfonamido, arenesulfonamido, aralkylsulfonamido,
alkylcarbonyl, acyloxy, cyano, or ureido; or R4, R5, or R6 together combine with R7 to form
an [-O-CH2-] covalently bound bridge between the sugar 2' and 4' carbons.
A1 is:

[Structure drawings not reproduced; legible fragments: X1=P, Y1, W1]

; H; OH; OCH3; W1; an abasic nucleotide; or absent;

(a preferred A1, especially with regard to anti-sense strands, is chosen from
5'-monophosphate ((HO)2(O)P-O-5'), 5'-diphosphate ((HO)2(O)P-O-P(HO)(O)-O-5'),
5'-triphosphate ((HO)2(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'), 5'-guanosine cap (7-methylated or
non-methylated) (7m-G-O-5'-(HO)(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'), 5'-adenosine cap
(Appp), and any modified or unmodified nucleotide cap structure
(N-O-5'-(HO)(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'), 5'-monothiophosphate (phosphorothioate;
(HO)2(S)P-O-5'), 5'-monodithiophosphate (phosphorodithioate; (HO)(HS)(S)P-O-5'),
5'-phosphorothiolate ((HO)2(O)P-S-5'); any additional combination of oxygen/sulfur replaced
monophosphate, diphosphate and triphosphates (e.g. 5'-alpha-thiotriphosphate,
5'-gamma-thiotriphosphate, etc.), 5'-phosphoramidates ((HO)2(O)P-NH-5', (HO)(NH2)(O)P-O-5'),
5'-alkylphosphonates (R=alkyl=methyl, ethyl, isopropyl, propyl, etc., e.g. RP(OH)(O)-O-5'-,
(OH)2(O)P-5'-CH2-), 5'-alkyletherphosphonates (R=alkylether=methoxymethyl (MeOCH2-),
ethoxymethyl, etc., e.g. RP(OH)(O)-O-5'-)).
A2 is:

[Structure drawing not reproduced; legible fragments: X2=P, Y2]

A3 is:

[Structure drawing not reproduced]

; and

A4 is:

[Structure drawing not reproduced; legible fragment: Z4]

; H; Z4; an inverted nucleotide; an abasic nucleotide; or absent.

W1 is OH, (CH2)nR10, (CH2)nNHR10, (CH2)nOR10, (CH2)nSR10; O(CH2)nR10;
O(CH2)nOR10, O(CH2)nNR10, O(CH2)nSR10; O(CH2)nSS(CH2)nOR10, O(CH2)nC(O)OR10,
NH(CH2)nR10; NH(CH2)nNR10; NH(CH2)nOR10, NH(CH2)nSR10; S(CH2)nR10, S(CH2)nNR10,
S(CH2)nOR10, S(CH2)nSR10, O(CH2CH2O)mCH2CH2OR10; O(CH2CH2O)mCH2CH2NHR10,
NH(CH2CH2NH)mCH2CH2NHR10; Q-R10, O-Q-R10, N-Q-R10, S-Q-R10 or -O-. W4 is O, CH2,
NH, or S.

X1, X2, X3, and X4 are each, independently, O or S.

Y1, Y2, Y3, and Y4 are each, independently, OH, O⁻, OR8, S, Se, BH3⁻, H, NHR9,
N(R9)2, alkyl, cycloalkyl, aralkyl, aryl, or heteroaryl, each of which may be optionally
substituted.

Z1, Z2, and Z3 are each independently O, CH2, NH, or S. Z4 is OH, (CH2)nR10,
(CH2)nNHR10, (CH2)nOR10, (CH2)nSR10; O(CH2)nR10; O(CH2)nOR10, O(CH2)nNR10,
O(CH2)nSR10, O(CH2)nSS(CH2)nOR10, O(CH2)nC(O)OR10; NH(CH2)nR10; NH(CH2)nNR10;
NH(CH2)nOR10, NH(CH2)nSR10; S(CH2)nR10, S(CH2)nNR10, S(CH2)nOR10, S(CH2)nSR10,
O(CH2CH2O)mCH2CH2OR10, O(CH2CH2O)mCH2CH2NHR10,
NH(CH2CH2NH)mCH2CH2NHR10; Q-R10, O-Q-R10, N-Q-R10, S-Q-R10.

x is 5-100, chosen to comply with a length for an RNA agent described herein.

R7 is H; or is together combined with R4, R5, or R6 to form an [-O-CH2-] covalently
bound bridge between the sugar 2' and 4' carbons.

R8 is alkyl, cycloalkyl, aryl, aralkyl, heterocyclyl, heteroaryl, amino acid, or sugar; R9
is NH2, alkylamino, dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino,
diheteroaryl amino, or amino acid; and R10 is H; fluorophore (pyrene, TAMRA, fluorescein,
Cy3 or Cy5 dyes); sulfur, silicon, boron or ester protecting group; intercalating agents (e.g.
acridines), cross-linkers (e.g. psoralene, mitomycin C), porphyrins (TPPC4, texaphyrin,
Sapphyrin), polycyclic aromatic hydrocarbons (e.g., phenazine, dihydrophenazine), artificial
endonucleases (e.g. EDTA), lipophilic carriers (cholesterol, cholic acid, adamantane acetic
acid, 1-pyrene butyric acid, dihydrotestosterone, 1,3-Bis-O(hexadecyl)glycerol,
geranyloxyhexyl group, hexadecylglycerol, borneol, menthol, 1,3-propanediol, heptadecyl
group, palmitic acid, myristic acid, O3-(oleoyl)lithocholic acid, O3-(oleoyl)cholenic acid,
dimethoxytrityl, or phenoxazine) and peptide conjugates (e.g., antennapedia peptide, Tat
peptide), alkylating agents, phosphate, amino, mercapto, PEG (e.g., PEG-40K), MPEG,
[MPEG]2, polyamino; alkyl, cycloalkyl, aryl, aralkyl, heteroaryl; radiolabelled markers,
enzymes, haptens (e.g. biotin), transport/absorption facilitators (e.g., aspirin, vitamin E, folic
acid), synthetic ribonucleases (e.g., imidazole, bisimidazole, histamine, imidazole clusters,
acridine-imidazole conjugates, Eu3+ complexes of tetraazamacrocycles); or an RNA agent.
m is 0-1,000,000, and n is 0-20. Q is a spacer selected from the group consisting of abasic
sugar, amide, carboxy, oxyamine, oxyimine, thioether, disulfide, thiourea, sulfonamide, or
morpholino, biotin or fluorescein reagents.
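
For illustration only, the integer ranges stated above for Formula 2 (x is 5-100, m is
0-1,000,000, n is 0-20) can be checked mechanically. The class and function names below are
hypothetical and are introduced solely for this sketch; only the numeric ranges are taken
from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Formula2Indices:
    """Hypothetical container for the integer indices used with Formula 2."""
    x: int  # number of repeating units; the text gives 5-100
    m: int  # repeat count in the (CH2CH2O)m and (CH2CH2NH)m chains; the text gives 0-1,000,000
    n: int  # methylene count in the (CH2)n spacers; the text gives 0-20

    def __post_init__(self) -> None:
        limits = {"x": (5, 100), "m": (0, 1_000_000), "n": (0, 20)}
        for name, (low, high) in limits.items():
            value = getattr(self, name)
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside the stated range {low}-{high}")


def is_within_stated_ranges(x: int, m: int, n: int) -> bool:
    """Return True when all three indices fall inside the ranges recited for Formula 2."""
    try:
        Formula2Indices(x=x, m=m, n=n)
    except ValueError:
        return False
    return True


print(is_within_stated_ranges(21, 3, 6))
print(is_within_stated_ranges(4, 3, 6))
```

Such a check addresses only the recited numeric limits; the text additionally requires that x
be chosen to comply with a length for an RNA agent described herein.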
Preferred RNA agents in which the entire phosphate group has been replaced have the
following structure (see Formula 3 below):

[Structure drawing not reproduced]

FORMULA 3

Referring to Formula 3, A10-A40 is L-G-L; A10 and/or A40 may be absent, in which L
is a linker, wherein one or both L may be present or absent and is selected from the group
consisting of CH2(CH2)g; N(CH2)g; O(CH2)g; S(CH2)g. G is a functional group selected from
the group consisting of siloxane, carbonate, carboxymethyl, carbamate, amide, thioether,
ethylene oxide linker, sulfonate, sulfonamide, thioformacetal, formacetal, oxime,
methyleneimino, methylenemethylimino, methylenehydrazo, methylenedimethylhydrazo and
methyleneoxymethylimino.
R10, R20, and R30 are each, independently, H (i.e., abasic nucleotides), adenine,
guanine, cytosine and uracil, inosine, thymine, xanthine, hypoxanthine, nubularine,
tubercidine, isoguanisine, 2-aminoadenine, 6-methyl and other alkyl derivatives of adenine
and guanine, 2-propyl and other alkyl derivatives of adenine and guanine, 5-halouracil and
cytosine, 5-propynyl uracil and cytosine, 6-azo uracil, cytosine and thymine, 5-uracil
(pseudouracil), 4-thiouracil, 5-halouracil, 5-(2-aminopropyl)uracil, 5-amino allyl uracil,
8-halo, amino, thiol, thioalkyl, hydroxyl and other 8-substituted adenines and guanines,
5-trifluoromethyl and other 5-substituted uracils and cytosines, 7-methylguanine, 5-substituted
pyrimidines, 6-azapyrimidines and N-2, N-6 and O-6 substituted purines, including
2-aminopropyladenine, 5-propynyluracil and 5-propynylcytosine, dihydrouracil,
3-deaza-5-azacytosine, 2-aminopurine, 5-alkyluracil, 7-alkylguanine, 5-alkyl cytosine,
7-deazaadenine, 7-deazaguanine, N6, N6-dimethyladenine, 2,6-diaminopurine,
5-amino-allyl-uracil, N3-methyluracil, substituted 1,2,4-triazoles, 2-pyridinone, 5-nitroindole,
3-nitropyrrole, 5-methoxyuracil, uracil-5-oxyacetic acid, 5-methoxycarbonylmethyluracil,
5-methyl-2-thiouracil, 5-methoxycarbonylmethyl-2-thiouracil, 5-methylaminomethyl-2-thiouracil,
3-(3-amino-3-carboxypropyl)uracil, 3-methylcytosine, 5-methylcytosine, N4-acetyl cytosine,
2-thiocytosine, N6-methyladenine, N6-isopentyladenine, 2-methylthio-N6-isopentenyladenine,
N-methylguanines, or O-alkylated bases.

R40, R50, and R60 are each, independently, OR8, O(CH2CH2O)mCH2CH2OR8;
O(CH2)nR9; O(CH2)nOR9, H; halo; NH2; NHR8; N(R8)2; NH(CH2CH2NH)mCH2CH2R9;
NHC(O)R8; cyano; mercapto, SR7; alkyl-thio-alkyl; alkyl, aralkyl, cycloalkyl, aryl,
heteroaryl, alkenyl, alkynyl, each of which may be optionally substituted with halo, hydroxy,
oxo, nitro, haloalkyl, alkyl, alkaryl, aryl, aralkyl, alkoxy, aryloxy, amino, alkylamino,
dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino, diheteroaryl amino,
acylamino, alkylcarbamoyl, arylcarbamoyl, aminoalkyl, alkoxycarbonyl, carboxy,
hydroxyalkyl, alkanesulfonyl, alkanesulfonamido, arenesulfonamido, aralkylsulfonamido,
alkylcarbonyl, acyloxy, cyano, and ureido groups; or R40, R50, or R60 together combine with
R70 to form an [-O-CH2-] covalently bound bridge between the sugar 2' and 4' carbons.

x is 5-100 or chosen to comply with a length for an RNA agent described herein.

R70 is H; or is together combined with R40, R50, or R60 to form an [-O-CH2-]
covalently bound bridge between the sugar 2' and 4' carbons.
R8 is alkyl, cycloalkyl, aryl, aralkyl, heterocyclyl, heteroaryl, amino acid, or sugar;
and R9 is NH2, alkylamino, dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl
amino, diheteroaryl amino, or amino acid. m is 0-1,000,000, n is 0-20, and g is 0-2.

Preferred nucleoside surrogates have the following structure (see Formula 4 below):

SLR100-(M-SLR200)x-M-SLR300
FORMULA 4

S is a nucleoside surrogate selected from the group consisting of morpholino,
cyclobutyl, pyrrolidine and peptide nucleic acid. L is a linker and is selected from the group
consisting of CH2(CH2)g; N(CH2)g; O(CH2)g; S(CH2)g; -C(O)(CH2)n- or may be absent. M is
an amide bond; sulfonamide; sulfinate; phosphate group; modified phosphate group as
described herein; or may be absent.

R100, R200, and R300 are each, independently, H (i.e., abasic nucleotides), adenine,
guanine, cytosine and uracil, inosine, thymine, xanthine, hypoxanthine, nubularine,
tubercidine, isoguanisine, 2-aminoadenine, 6-methyl and other alkyl derivatives of adenine
and guanine, 2-propyl and other alkyl derivatives of adenine and guanine, 5-halouracil and
cytosine, 5-propynyl uracil and cytosine, 6-azo uracil, cytosine and thymine, 5-uracil
(pseudouracil), 4-thiouracil, 5-halouracil, 5-(2-aminopropyl)uracil, 5-amino allyl uracil,
8-halo, amino, thiol, thioalkyl, hydroxyl and other 8-substituted adenines and guanines,
5-trifluoromethyl and other 5-substituted uracils and cytosines, 7-methylguanine, 5-substituted
pyrimidines, 6-azapyrimidines and N-2, N-6 and O-6 substituted purines, including
2-aminopropyladenine, 5-propynyluracil and 5-propynylcytosine, dihydrouracil,
3-deaza-5-azacytosine, 2-aminopurine, 5-alkyluracil, 7-alkylguanine, 5-alkyl cytosine,
7-deazaadenine, 7-deazaguanine, N6, N6-dimethyladenine, 2,6-diaminopurine,
5-amino-allyl-uracil, N3-methyluracil, substituted 1,2,4-triazoles, 2-pyridinones, 5-nitroindole,
3-nitropyrrole, 5-methoxyuracil, uracil-5-oxyacetic acid, 5-methoxycarbonylmethyluracil,
5-methyl-2-thiouracil, 5-methoxycarbonylmethyl-2-thiouracil, 5-methylaminomethyl-2-thiouracil,
3-(3-amino-3-carboxypropyl)uracil, 3-methylcytosine, 5-methylcytosine, N4-acetyl cytosine,
2-thiocytosine, N6-methyladenine, N6-isopentyladenine, 2-methylthio-N6-isopentenyladenine,
N-methylguanines, or O-alkylated bases.

x is 5-100, or chosen to comply with a length for an RNA agent described herein; and
g is 0-2.

Nuclease resistant monomers 

In one aspect, the invention features a nuclease resistant monomer, or an iRNA
agent which incorporates a nuclease resistant monomer (NRM), such as those described
herein and those described in copending, co-owned United States Provisional Application
Serial No. 60/469,612 (Attorney Docket No. 14174-069P01), filed on May 9, 2003, which is
hereby incorporated by reference.

In addition, the invention includes iRNA agents having an NRM and another element
described herein. E.g., the invention includes an iRNA agent described herein, e.g., a
palindromic iRNA agent, an iRNA agent having a non canonical pairing, an iRNA agent
which targets a gene described herein, e.g., a gene active in the liver, an iRNA agent having
an architecture or structure described herein, an iRNA associated with an amphipathic
delivery agent described herein, an iRNA associated with a drug delivery module described
herein, an iRNA agent administered as described herein, or an iRNA agent formulated as
described herein, which also incorporates an NRM.

An iRNA agent can include monomers which have been modified so as to inhibit
degradation, e.g., by nucleases, e.g., endonucleases or exonucleases, found in the body of a 
subject. These monomers are referred to herein as NRM's, or nuclease resistance promoting 
monomers or modifications. In many cases these modifications will modulate other 
properties of the iRNA agent as well, e.g., the ability to interact with a protein, e.g., a 
transport protein, e.g., serum albumin, or a member of the RISC (RNA-induced Silencing 
Complex), or the ability of the first and second sequences to form a duplex with one another 
or to form a duplex with another sequence, e.g., a target molecule. 

While not wishing to be bound by theory, it is believed that modifications of the 
sugar, base, and/or phosphate backbone in an iRNA agent can enhance endonuclease and 
exonuclease resistance, and can enhance interactions with transporter proteins and one or 
more of the functional components of the RISC complex. Preferred modifications are those 
that increase exonuclease and endonuclease resistance and thus prolong the half-life of the
iRNA agent prior to interaction with the RISC complex, but at the same time do not render
the iRNA agent resistant to endonuclease activity in the RISC complex. Again, while not
wishing to be bound by any theory, it is believed that placement of the modifications at or
near the 3' and/or 5' end of antisense strands can result in iRNA agents that meet the
preferred nuclease resistance criteria delineated above. Still without wishing to be
bound by any theory, it is believed that placement of the modifications at, e.g., the middle of a
sense strand can result in iRNA agents that are relatively less likely to undergo off-targeting.

Modifications described herein can be incorporated into any double-stranded RNA and
RNA-like molecule described herein, e.g., an iRNA agent. An iRNA agent may include a
duplex comprising a hybridized sense and antisense strand, in which the antisense strand
and/or the sense strand may include one or more of the modifications described herein. The
antisense strand may include modifications at the 3' end and/or the 5' end and/or at one or
more positions that occur 1-6 (e.g., 1-5, 1-4, 1-3, 1-2) nucleotides from either end of the 
strand. The sense strand may include modifications at the 3' end and/or the 5' end and/or at 
any one of the intervening positions between the two ends of the strand. The iRNA agent 
may also include a duplex comprising two hybridized antisense strands. The first and/or the 
second antisense strand may include one or more of the modifications described herein. 
Thus, one and/or both antisense strands may include modifications at the 3' end and/or the 5' 
end and/or at one or more positions that occur 1-6 (e.g., 1-5, 1-4, 1-3, 1-2) nucleotides from 
either end of the strand. Particular configurations are discussed below. 

Modifications that can be useful for producing iRNA agents that meet the preferred 
nuclease resistance criteria delineated above can include one or more of the following 
chemical and/or stereochemical modifications of the sugar, base, and/or phosphate backbone: 

(i) chiral (Sp) thioates. Thus, preferred NRM's include nucleotide dimers
enriched or pure for a particular chiral form of a modified phosphate group containing a
heteroatom at the nonbridging position, e.g., Sp or Rp, at the position X, where this is the
position normally occupied by the oxygen. The atom at X can also be S, Se, NR2, or BR3.
When X is S, an enriched or chirally pure Sp linkage is preferred. Enriched means at least 70,
80, 90, 95, or 99% of the preferred form. Such NRM's are discussed in more detail below;

(ii) attachment of one or more cationic groups to the sugar, base, and/or the 
phosphorus atom of a phosphate or modified phosphate backbone moiety. Thus, preferred 
NRM's include monomers at the terminal position derivatized with a cationic group. As the 5'
end of an antisense sequence should have a terminal -OH or phosphate group, this NRM is
preferably not used at the 5' end of an antisense sequence. The group should be attached at a
position on the base which minimizes interference with H bond formation and hybridization,
e.g., away from the face which interacts with the complementary base on the other strand,
e.g., at the 5-position of a pyrimidine or the 7-position of a purine. These are discussed in
more detail below;

(iii) nonphosphate linkages at the termini. Thus, preferred NRM's include
nonphosphate linkages, e.g., a linkage of 4 atoms which confers greater resistance to cleavage
than does a phosphate bond. Examples include 3' CH2-NCH3-O-CH2-5' and
3' CH2-NH-(O=)-CH2-5';

(iv) 3'-bridging thiophosphates and 5'-bridging thiophosphates. Thus, preferred 
NRM's can include these structures;

(v) L-RNA, 2'-5' linkages, inverted linkages, α-nucleosides. Thus, other preferred
NRM's include: L-nucleosides and dimeric nucleotides derived from L-nucleosides; 2'-5'
phosphate, non-phosphate and modified phosphate linkages (e.g., thiophosphates,
phosphoramidates and boronophosphates); dimers having inverted linkages, e.g., 3'-3' or
5'-5' linkages; monomers having an alpha linkage at the 1' site on the sugar, e.g., the structures
described herein having an alpha linkage;

(vi) conjugate groups. Thus, preferred NRM's can include, e.g., a targeting moiety or
a conjugated ligand described herein conjugated with the monomer, e.g., through the sugar,
base, or backbone;

(vii) abasic linkages. Thus, preferred NRM's can include an abasic monomer, e.g., an
abasic monomer as described herein (e.g., a nucleobaseless monomer); an aromatic or
heterocyclic or polyheterocyclic aromatic monomer as described herein; and

(viii) 5'-phosphonates and 5'-phosphate prodrugs. Thus, preferred NRM's include
monomers, preferably at the terminal position, e.g., the 5' position, in which one or more
atoms of the phosphate group are derivatized with a protecting group, which protecting group
or groups are removed as a result of the action of a component in the subject's body, e.g., a
carboxyesterase or an enzyme present in the subject's body. E.g., a phosphate prodrug in
which a carboxyesterase cleaves the protected molecule, resulting in the production of a
thioate anion which attacks a carbon adjacent to the O of a phosphate, resulting in the
production of an unprotected phosphate.

One or more different NRM modifications can be introduced into an iRNA agent or 
into a sequence of an iRNA agent. An NRM modification can be used more than once in a 
sequence or in an iRNA agent. As some NRM's interfere with hybridization, the total
number incorporated should be such that acceptable levels of iRNA agent duplex formation
are maintained.

In some embodiments NRM modifications are introduced into the terminal region, the
cleavage site, or the cleavage region of a sequence (a sense strand or sequence) which does
not target a desired sequence or gene in the subject. This can reduce off-target silencing.

Chiral Sp Thioates

A modification can include the alteration, e.g., replacement, of one or both of the 
non-linking (X and Y) phosphate oxygens and/or of one or more of the linking (W and Z) 
phosphate oxygens. Formula X below depicts a phosphate moiety linking two sugar/sugar
surrogate-base moieties, SB1 and SB2.

SB1-W-P(=X)(Y)-Z-SB2

FORMULA X

In certain embodiments, one of the non-linking phosphate oxygens in the phosphate 
backbone moiety (X and Y) can be replaced by any one of the following: S, Se, BR3 (R is
hydrogen, alkyl, aryl, etc.), C (i.e., an alkyl group, an aryl group, etc.), H, NR2 (R is
hydrogen, alkyl, aryl, etc.), or OR (R is alkyl or aryl). The phosphorus atom in an
unmodified phosphate group is achiral. However, replacement of one of the non-linking 
oxygens with one of the above atoms or groups of atoms renders the phosphorus atom chiral;
in other words a phosphorus atom in a phosphate group modified in this way is a stereogenic
center. The stereogenic phosphorus atom can possess either the "R" configuration (herein
Rp) or the "S" configuration (herein Sp). Thus if 60% of a population of stereogenic
phosphorus atoms have the Rp configuration, then the remaining 40% of the population of
stereogenic phosphorus atoms have the Sp configuration.
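
The complementary-fraction arithmetic in the preceding paragraph, together with the "enriched" thresholds given in item (i) above (at least 70, 80, 90, 95, or 99% of the preferred form), can be sketched as follows. This is only an illustrative sketch; the function names `complementary_fraction` and `is_enriched` are hypothetical and do not appear in the specification.

```python
def complementary_fraction(fraction: float) -> float:
    """Given the fraction of stereogenic phosphorus atoms in one
    configuration (e.g., Rp), return the fraction in the other (e.g., Sp)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return 1.0 - fraction


def is_enriched(fraction: float, threshold: float = 0.70) -> bool:
    """True when the preferred form makes up at least `threshold`
    of the population (0.70 is the lowest threshold listed in item (i))."""
    return fraction >= threshold


# Example from the text: 60% Rp leaves 40% Sp, which is not Sp-enriched.
sp = complementary_fraction(0.60)
print(round(sp, 2), is_enriched(sp))
```

The threshold is a parameter because the specification lists several alternative levels of enrichment rather than a single cutoff.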

In some embodiments, iRNA agents having phosphate groups in which a phosphate
non-linking oxygen has been replaced by another atom or group of atoms may contain a
population of stereogenic phosphorus atoms in which at least about 50% of these atoms (e.g.,
at least about 60% of these atoms, at least about 70% of these atoms, at least about 80% of
these atoms, at least about 90% of these atoms, at least about 95% of these atoms, at least
about 98% of these atoms, at least about 99% of these atoms) have the Sp configuration.
Alternatively, iRNA agents having phosphate groups in which a phosphate non-linking
oxygen has been replaced by another atom or group of atoms may contain a population of
stereogenic phosphorus atoms in which at least about 50% of these atoms (e.g., at least about
60% of these atoms, at least about 70% of these atoms, at least about 80% of these atoms, at
least about 90% of these atoms, at least about 95% of these atoms, at least about 98% of
these atoms, at least about 99% of these atoms) have the Rp configuration. In other
embodiments, the population of stereogenic phosphorus atoms may have the Sp
configuration and may be substantially free of stereogenic phosphorus atoms having the Rp
configuration. In still other embodiments, the population of stereogenic phosphorus atoms
may have the Rp configuration and may be substantially free of stereogenic phosphorus
atoms having the Sp configuration. As used herein, the phrase "substantially free of
stereogenic phosphorus atoms having the Rp configuration" means that moieties containing
stereogenic phosphorus atoms having the Rp configuration cannot be detected by
conventional methods known in the art (chiral HPLC, 1H NMR analysis using chiral shift
reagents, etc.). As used herein, the phrase "substantially free of stereogenic phosphorus
atoms having the Sp configuration" means that moieties containing stereogenic phosphorus
atoms having the Sp configuration cannot be detected by conventional methods known in the
art (chiral HPLC, 1H NMR analysis using chiral shift reagents, etc.).

In a preferred embodiment, modified iRNA agents contain a phosphorothioate group,
i.e., a phosphate group in which a phosphate non-linking oxygen has been replaced by a
sulfur atom. In an especially preferred embodiment, the population of phosphorothioate
stereogenic phosphorus atoms may have the Sp configuration and be substantially free of
stereogenic phosphorus atoms having the Rp configuration.

Phosphorothioates may be incorporated into iRNA agents using dimers, e.g., formulas
X-1 and X-2. The former can be used to introduce phosphorothioate
at the 3' end of a strand, while the latter can be used to introduce this modification at the 5'
end or at a position that occurs, e.g., 1, 2, 3, 4, 5, or 6 nucleotides from either end of the
strand.

[Formulas X-1 and X-2: structures not reproduced. Surviving labels: BASE, solid phase reagent, N(iPr)2.]

In the above formulas, Y can be 2-cyanoethoxy, W and Z can be O, R2' can be, e.g., a
substituent that can impart the C-3 endo configuration to the sugar (e.g., OH, F, OCH3),
DMT is dimethoxytrityl, and "BASE" can be a natural, unusual, or a universal base.

X-1 and X-2 can be prepared using chiral reagents or directing groups that can result
in phosphorothioate-containing dimers having a population of stereogenic phosphorus atoms
having essentially only the Rp configuration (i.e., being substantially free of the Sp
configuration) or only the Sp configuration (i.e., being substantially free of the Rp
configuration). Alternatively, dimers can be prepared having a population of stereogenic
phosphorus atoms in which about 50% of the atoms have the Rp configuration and about
50% of the atoms have the Sp configuration. Dimers having stereogenic phosphorus atoms
with the Rp configuration can be identified and separated from dimers having stereogenic
phosphorus atoms with the Sp configuration using, e.g., enzymatic degradation and/or
conventional chromatography techniques.

Cationic Groups 

Modifications can also include attachment of one or more cationic groups to the 
sugar, base, and/or the phosphorus atom of a phosphate or modified phosphate backbone 
moiety. A cationic group can be attached to any atom capable of substitution on a natural, 
unusual or universal base. A preferred position is one that does not interfere with 
hybridization, i.e., does not interfere with the hydrogen bonding interactions needed for base 
pairing. A cationic group can be attached, e.g., through the C2' position of a sugar or
analogous position in a cyclic or acyclic sugar surrogate. Cationic groups can include, e.g.,
protonated amino groups, derived from, e.g., O-AMINE (AMINE = NH2; alkylamino,
dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino, or diheteroaryl
amino, ethylene diamine, polyamino); aminoalkoxy, e.g., O(CH2)nAMINE (e.g., AMINE =
NH2; alkylamino, dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino, or
diheteroaryl amino, ethylene diamine, polyamino); amino (e.g., NH2; alkylamino,
dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino, diheteroaryl amino,
or amino acid); or NH(CH2CH2NH)nCH2CH2-AMINE (AMINE = NH2; alkylamino,
dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino, or diheteroaryl
amino).

Nonphosphate Linkages 

Modifications can also include the incorporation of nonphosphate linkages at the 5' 
and/or 3' end of a strand. Examples of nonphosphate linkages which can replace the 
phosphate group include methyl phosphonate, hydroxylamino, siloxane, carbonate,
carboxymethyl, carbamate, amide, thioether, ethylene oxide linker, sulfonate, sulfonamide,
thioformacetal, formacetal, oxime, methyleneimino, methylenemethylimino,
methylenehydrazo, methylenedimethylhydrazo and methyleneoxymethylimino. Preferred
replacements include the methyl phosphonate and hydroxylamino groups. 

3'-bridging thiophosphates and 5'-bridging thiophosphates; locked-RNA, 2'-5'
linkages, inverted linkages, α-nucleosides; conjugate groups; abasic linkages; and
5'-phosphonates and 5'-phosphate prodrugs

Referring to formula X above, modifications can include replacement of one of the
bridging or linking phosphate oxygens in the phosphate backbone moiety (W and Z). Unlike
the situation where only one of X or Y is altered, the phosphorus center in the
phosphorodithioates is achiral, which precludes the formation of iRNA agents containing a
stereogenic phosphorus atom.

Modifications can also include linking two sugars via a phosphate or modified
phosphate group through the 2' position of a first sugar and the 5' position of a second sugar.
Also contemplated are inverted linkages in which both a first and second sugar are each
linked through the respective 3' positions. Modified RNA's can also include "abasic" sugars,
which lack a nucleobase at C-1'. The sugar group can also contain one or more carbons that
possess the opposite stereochemical configuration to that of the corresponding carbon in
ribose. Thus, a modified iRNA agent can include nucleotides containing, e.g., arabinose, as
the sugar. In another subset of this modification, the natural, unusual, or universal base may
have the α-configuration. Modifications can also include L-RNA.

Modifications can also include 5'-phosphonates, e.g., P(O)(O-)2-X-C5'-sugar (X =
CH2, CF2, CHF), and 5'-phosphate prodrugs, e.g., P(O)[OCH2CH2SC(O)R]2CH2C5'-sugar.
In the latter case, the prodrug groups may be decomposed via reaction first with carboxy
esterases. The remaining ethyl thiolate group can then depart, via intramolecular SN2
displacement, as episulfide to afford the underivatized phosphate group.

Modification can also include the addition of conjugating groups described elsewhere
herein, which are preferably attached to an iRNA agent through any amino group available
for conjugation.

Nuclease resistant modifications include some which can be placed only at the
terminus and others which can go at any position. Generally, these modifications can
inhibit hybridization, so it is preferable to use them only in terminal regions, and preferable
not to use them at the cleavage site or in the cleavage region of a sequence which targets a
subject sequence or gene. They can be used anywhere in a sense sequence, provided that
sufficient hybridization between the two sequences of the iRNA agent is maintained. In
some embodiments it is desirable to put the NRM at the cleavage site or in the cleavage
region of a sequence which does not target a subject sequence or gene, as it can minimize
off-target silencing.

In addition, an iRNA agent described herein can have an overhang which does not
form a duplex structure with the other sequence of the iRNA agent (it is an overhang) but
which does hybridize, either with itself or with another nucleic acid other than the other
sequence of the iRNA agent.

In most cases, the nuclease-resistance promoting modifications will be distributed
differently depending on whether the sequence will target a sequence in the subject (often
referred to as an antisense sequence) or will not target a sequence in the subject (often
referred to as a sense sequence). If a sequence is to target a sequence in the subject,
modifications which interfere with or inhibit endonuclease cleavage should not be inserted in
the region which is subject to RISC mediated cleavage, e.g., the cleavage site or the cleavage
region. (As described in Elbashir et al., 2001, Genes and Dev. 15: 188, hereby incorporated
by reference, cleavage of the target occurs about in the middle of a 20 or 21 nt guide RNA, or
about 10 or 11 nucleotides upstream of the first nucleotide which is complementary to the
guide sequence. As used herein, cleavage site refers to the nucleotide on either side of the
cleavage site, on the target or on the iRNA agent strand which hybridizes to it. Cleavage
region means any nucleotide within 1, 2, or 3 nucleotides of the cleavage site, in either direction.)
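
The definitions of cleavage site and cleavage region given above can be illustrated with a short sketch. It assumes, as a simplification not stated in the specification, that positions are numbered from 1 at the 5' end of the guide strand and that the cleavage point lies exactly at the midpoint, i.e., between positions 10 and 11 of a 20 or 21 nt guide. The function names are hypothetical.

```python
def cleavage_site(guide_length: int) -> tuple:
    """Return the two positions (1-indexed from the 5' end) flanking the
    assumed cleavage point at the middle of the guide strand."""
    if guide_length < 2:
        raise ValueError("guide strand must have at least 2 nucleotides")
    left = guide_length // 2          # 10 for a 20 or 21 nt guide
    return (left, left + 1)


def cleavage_region(guide_length: int, window: int = 3) -> set:
    """Return every position within `window` (1, 2, or 3) nucleotides of
    the cleavage site, in either direction, including the site itself."""
    if window not in (1, 2, 3):
        raise ValueError("window must be 1, 2, or 3")
    left, right = cleavage_site(guide_length)
    first = max(1, left - window)
    last = min(guide_length, right + window)
    return set(range(first, last + 1))


print(cleavage_site(21))               # positions flanking the cleavage point
print(sorted(cleavage_region(21, 3)))  # widest region for a 21 nt guide
```

Because the text says cleavage occurs "about" in the middle, the midpoint rule here is an approximation chosen for illustration rather than a fixed property of any particular iRNA agent.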

Such modifications can be introduced into the terminal regions, e.g., at the terminal
position or within 2, 3, 4, or 5 positions of the terminus, of a sequence which targets or a
sequence which does not target a sequence in the subject.

An iRNA agent can have a first and a second strand chosen from the following:

a first strand which does not target a sequence and which has an NRM modification at
or within 1, 2, 3, 4, 5, or 6 positions from the 3' end;

a first strand which does not target a sequence and which has an NRM modification at
or within 1, 2, 3, 4, 5, or 6 positions from the 5' end;

a first strand which does not target a sequence and which has an NRM modification at
or within 1, 2, 3, 4, 5, or 6 positions from the 3' end and which has an NRM modification at
or within 1, 2, 3, 4, 5, or 6 positions from the 5' end;

a first strand which does not target a sequence and which has an NRM modification at
the cleavage site or in the cleavage region;

a first strand which does not target a sequence and which has an NRM modification at
the cleavage site or in the cleavage region and one or more of an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 3' end, an NRM modification at or within 1, 2, 3,
4, 5, or 6 positions from the 5' end, or NRM modifications at or within 1, 2, 3, 4, 5, or 6
positions from both the 3' and the 5' end; and

a second strand which targets a sequence and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 3' end;

a second strand which targets a sequence and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 5' end (5' end NRM modifications are
preferentially not at the terminus but rather at a position 1, 2, 3, 4, 5, or 6 away from the 5'
terminus of an antisense strand);

a second strand which targets a sequence and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 3' end and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 5' end;

a second strand which targets a sequence and which preferably does not have an
NRM modification at the cleavage site or in the cleavage region;

a second strand which targets a sequence and which does not have an NRM
modification at the cleavage site or in the cleavage region and one or more of an NRM
modification at or within 1, 2, 3, 4, 5, or 6 positions from the 3' end, an NRM modification at
or within 1, 2, 3, 4, 5, or 6 positions from the 5' end, or NRM modifications at or within 1, 2,
3, 4, 5, or 6 positions from both the 3' and the 5' end (5' end NRM modifications are
preferentially not at the terminus but rather at a position 1, 2, 3, 4, 5, or 6 away from the 5'
terminus of an antisense strand).
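
The placement preferences enumerated above can be expressed as a simple check. The sketch below is illustrative only and rests on several assumptions that are not part of the specification: positions are numbered from 1 at the 5' end, the cleavage point is taken to be the midpoint of the strand, the cleavage region extends 3 nucleotides to either side of it, and "at or within 6 positions" is read as a distance of at most 6 from the terminal nucleotide. The names `Strand` and `placement_issues` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strand:
    length: int               # number of nucleotides
    targets: bool             # True for a targeting (antisense) strand
    nrm_positions: frozenset  # NRM positions, 1-indexed from the 5' end


def placement_issues(strand: Strand, reach: int = 6, window: int = 3) -> list:
    """List departures from the placement preferences for one strand."""
    issues = []
    left = strand.length // 2  # assumed cleavage point: between left and left + 1
    region = range(max(1, left - window),
                   min(strand.length, left + 1 + window) + 1)
    for pos in sorted(strand.nrm_positions):
        if not 1 <= pos <= strand.length:
            issues.append(f"position {pos}: outside the strand")
            continue
        if not strand.targets:
            continue  # a non-targeting strand may carry an NRM at any position
        if pos in region:
            issues.append(f"position {pos}: in the cleavage region")
        elif min(pos - 1, strand.length - pos) > reach:
            issues.append(f"position {pos}: more than {reach} positions from either end")
        if pos == 1:
            issues.append("position 1: 5' terminus of a targeting strand (not preferred)")
    return issues


antisense = Strand(length=21, targets=True, nrm_positions=frozenset({2, 20}))
sense = Strand(length=21, targets=False, nrm_positions=frozenset({10, 11}))
print(placement_issues(antisense), placement_issues(sense))
```

For a 21-nucleotide strand with these defaults the assumed cleavage region spans positions 7 to 14. The check reports preferences rather than hard constraints, since the specification describes several of these placements as "preferably" avoided.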

An iRNA agent can also target two sequences and can have a first and second strand
chosen from:

a first strand which targets a sequence and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 3' end;

a first strand which targets a sequence and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 5' end (5' end NRM modifications are
preferentially not at the terminus but rather at a position 1, 2, 3, 4, 5, or 6 away from the 5'
terminus of an antisense strand);

a first strand which targets a sequence and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 3' end and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 5' end;

a first strand which targets a sequence and which preferably does not have an
NRM modification at the cleavage site or in the cleavage region;

a first strand which targets a sequence and which does not have an NRM modification
at the cleavage site or in the cleavage region and one or more of an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 3' end, an NRM modification at or within 1, 2, 3,
4, 5, or 6 positions from the 5' end, or NRM modifications at or within 1, 2, 3, 4, 5, or 6
positions from both the 3' and the 5' end (5' end NRM modifications are preferentially not at
the terminus but rather at a position 1, 2, 3, 4, 5, or 6 away from the 5' terminus of an
antisense strand); and

a second strand which targets a sequence and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 3' end;

a second strand which targets a sequence and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 5' end (5' end NRM modifications are
preferentially not at the terminus but rather at a position 1, 2, 3, 4, 5, or 6 away from the 5'
terminus of an antisense strand);

a second strand which targets a sequence and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 3' end and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 5' end;

a second strand which targets a sequence and which preferably does not have an
NRM modification at the cleavage site or in the cleavage region;

a second strand which targets a sequence and which does not have an NRM
modification at the cleavage site or in the cleavage region and one or more of an NRM
modification at or within 1, 2, 3, 4, 5, or 6 positions from the 3' end, an NRM modification at
or within 1, 2, 3, 4, 5, or 6 positions from the 5' end, or NRM modifications at or within 1, 2,
3, 4, 5, or 6 positions from both the 3' and the 5' end (5' end NRM modifications are
preferentially not at the terminus but rather at a position 1, 2, 3, 4, 5, or 6 away from the 5'
terminus of an antisense strand).

Ribose Mimics 

In one aspect, the invention features a ribose mimic, or an iRNA agent which 
incorporates a ribose mimic, such as those described herein and those described in copending 
co-owned United States Provisional Application Serial No. 60/454,962 (Attorney Docket No. 
14174-064P01), filed on March 13, 2003, which is hereby incorporated by reference. 

In addition, the invention includes iRNA agents having a ribose mimic and another 
element described herein. E.g., the invention includes an iRNA agent described herein, e.g., 
a palindromic iRNA agent, an iRNA agent having a non canonical pairing, an iRNA agent 
which targets a gene described herein, e.g., a gene active in the liver, an iRNA agent having 
an architecture or structure described herein, an iRNA associated with an amphipathic 
delivery agent described herein, an iRNA associated with a drug delivery module described 
herein, an iRNA agent administered as described herein, or an iRNA agent formulated as 
described herein, which also incorporates a ribose mimic. 

Thus, an aspect of the invention features an iRNA agent that includes a secondary 
hydroxyl group, which can increase efficacy and/or confer nuclease resistance to the agent. 
Nucleases, e.g., cellular nucleases, can hydrolyze nucleic acid phosphodiester bonds, 
resulting in partial or complete degradation of the nucleic acid. The secondary hydroxy 
group confers nuclease resistance to an iRNA agent by rendering the iRNA agent less prone 
to nuclease degradation relative to an iRNA which lacks the modification. While not 
wishing to be bound by theory, it is believed that the presence of a secondary hydroxyl group 
on the iRNA agent can act as a structural mimic of a 3' ribose hydroxyl group, thereby 
causing it to be less susceptible to degradation. 

The secondary hydroxyl group refers to an "OH" radical that is attached to a carbon
atom substituted by two other carbons and a hydrogen. The secondary hydroxyl group that
confers nuclease resistance as described above can be part of any acyclic carbon-containing
group. The hydroxyl may also be part of any cyclic carbon-containing group, and preferably
one or more of the following conditions is met: (1) there is no ribose moiety between the
hydroxyl group and the terminal phosphate group or (2) the hydroxyl group is not on a sugar
moiety which is coupled to a base. The hydroxyl group is located at least two bonds (e.g., at
least three bonds away, at least four bonds away, at least five bonds away, at least six bonds
away, at least seven bonds away, at least eight bonds away, at least nine bonds away, at least
ten bonds away, etc.) from the terminal phosphate group phosphorus of the iRNA agent. In
preferred embodiments, there are five intervening bonds between the terminal phosphate
group phosphorus and the secondary hydroxyl group.

Preferred iRNA agent delivery modules with five intervening bonds between the
terminal phosphate group phosphorus and the secondary hydroxyl group have the following
structure (see formula Y below):

[Formula Y: structure not reproduced.]

Referring to formula Y, A is an iRNA agent, including any iRNA agent described
herein. The iRNA agent may be connected directly or indirectly (e.g., through a spacer or
linker) to "W" of the phosphate group. These spacers or linkers can include, e.g., -(CH2)n-,
-(CH2)nN-, -(CH2)nO-, -(CH2)nS-, O(CH2CH2O)nCH2CH2OH (e.g., n = 3 or 6), abasic sugars,
amide, carboxy, amine, oxyamine, oxyimine, thioether, disulfide, thiourea, sulfonamide, or
morpholino, or biotin and fluorescein reagents.

The iRNA agents can have a terminal phosphate group that is unmodified (e.g., W, X,
Y, and Z are O) or modified. In a modified phosphate group, W and Z can be independently
NH, O, or S; and X and Y can be independently S, Se, BH3-, C1-C6 alkyl, C6-C10 aryl, H, O,
O-, alkoxy or amino (including alkylamino, arylamino, etc.). Preferably, W, X and Z are O
and Y is S.

R1 and R3 are each, independently, hydrogen; or C1-C100 alkyl, optionally substituted
with hydroxyl, amino, halo, phosphate or sulfate and/or may be optionally inserted with N,
O, S, alkenyl or alkynyl.

R2 is hydrogen; C1-C100 alkyl, optionally substituted with hydroxyl, amino, halo,
phosphate or sulfate and/or may be optionally inserted with N, O, S, alkenyl or alkynyl; or,
when n is 1, R2 may be taken together with R4 or R6 to form a ring of 5-12 atoms.

R4 is hydrogen; C1-C100 alkyl, optionally substituted with hydroxyl, amino, halo,
phosphate or sulfate and/or may be optionally inserted with N, O, S, alkenyl or alkynyl; or,
when n is 1, R4 may be taken together with R2 or R5 to form a ring of 5-12 atoms.

R5 is hydrogen; C1-C100 alkyl, optionally substituted with hydroxyl, amino, halo,
phosphate or sulfate and/or may be optionally inserted with N, O, S, alkenyl or alkynyl; or,
when n is 1, R5 may be taken together with R4 to form a ring of 5-12 atoms.

R6 is hydrogen; C1-C100 alkyl, optionally substituted with hydroxyl, amino, halo,
phosphate or sulfate and/or may be optionally inserted with N, O, S, alkenyl or alkynyl; or,
when n is 1, R6 may be taken together with R2 to form a ring of 6-10 atoms.

R7 is hydrogen, C1-C100 alkyl, or C(O)(CH2)qC(O)NHR9; T is hydrogen or a
functional group; n and q are each independently 1-100; R8 is C1-C10 alkyl or C6-C10 aryl;
and R9 is hydrogen, C1-C10 alkyl, C6-C10 aryl or a solid support agent.

Preferred embodiments may include one or more of the following subsets of iRNA
agent delivery modules. 

In one subset of RNAi agent delivery modules, A can be connected directly or 
indirectly through a terminal 3' or 5' ribose sugar carbon of the RNA agent. 

In another subset of RNAi agent delivery modules, X, W, and Z are O and Y is S. 

In still yet another subset of RNAi agent delivery modules, n is 1, and R2 and R6 are
taken together to form a ring containing six atoms and R4 and R5 are taken together to form a
ring containing six atoms. Preferably, the ring system is a trans-decalin. For example, the
RNAi agent delivery module of this subset can include a compound of Formula (Y-1):



The functional group can be, for example, a targeting group (e.g., a steroid or a
carbohydrate), a reporter group (e.g., a fluorophore), or a label (an isotopically labelled
moiety). The targeting group can further include protein binding agents, endothelial cell
targeting groups (e.g., RGD peptides and mimetics), cancer cell targeting groups (e.g., folate,
Vitamin B12, Biotin), bone cell targeting groups (e.g., bisphosphonates, polyglutamates,
polyaspartates), multivalent mannose (e.g., for macrophage testing), lactose, galactose,
N-acetyl-galactosamine, monoclonal antibodies, glycoproteins, lectins, melanotropin, or
thyrotropin.

As can be appreciated by the skilled artisan, methods of synthesizing the compounds
of the formulae herein will be evident to those of ordinary skill in the art. The synthesized
compounds can be separated from a reaction mixture and further purified by a method such
as column chromatography, high pressure liquid chromatography, or recrystallization.
Additionally, the various synthetic steps may be performed in an alternate sequence or order
to give the desired compounds. Synthetic chemistry transformations and protecting group
methodologies (protection and deprotection) useful in synthesizing the compounds described
herein are known in the art and include, for example, those such as described in R. Larock,
Comprehensive Organic Transformations, VCH Publishers (1989); T.W. Greene and P.G.M.
Wuts, Protective Groups in Organic Synthesis, 2d. Ed., John Wiley and Sons (1991); L.
Fieser and M. Fieser, Fieser and Fieser's Reagents for Organic Synthesis, John Wiley and
Sons (1994); and L. Paquette, ed., Encyclopedia of Reagents for Organic Synthesis, John
Wiley and Sons (1995), and subsequent editions thereof.

Ribose Replacement Monomer Subunits 

iRNA agents can be modified in a number of ways which can optimize one or more
characteristics of the iRNA agent. In one aspect, the invention features a ribose replacement
monomer subunit (RRMS), or an iRNA agent which incorporates a RRMS, such as those
described herein and those described in one or more of United States Provisional Application
Serial No. 60/493,986 (Attorney Docket No. 14174-079P01), filed on August 8, 2003, which
is hereby incorporated by reference; United States Provisional Application Serial No.
60/494,597 (Attorney Docket No. 14174-080P01), filed on August 11, 2003, which is hereby
incorporated by reference; United States Provisional Application Serial No. 60/506,341
(Attorney Docket No. 14174-080P02), filed on September 26, 2003, which is hereby
incorporated by reference; and in United States Provisional Application Serial No.
60/518,453 (Attorney Docket No. 14174-080P03), filed on November 7, 2003, which is
hereby incorporated by reference.

In addition, the invention includes iRNA agents having a RRMS and another element
described herein. E.g., the invention includes an iRNA agent described herein, e.g., a
palindromic iRNA agent, an iRNA agent having a non canonical pairing, an iRNA agent
which targets a gene described herein, e.g., a gene active in the liver, an iRNA agent having
an architecture or structure described herein, an iRNA associated with an amphipathic
delivery agent described herein, an iRNA associated with a drug delivery module described
herein, an iRNA agent administered as described herein, or an iRNA agent formulated as
described herein, which also incorporates a RRMS.

The ribose sugar of one or more ribonucleotide subunits of an iRNA agent can be
replaced with another moiety, e.g., a non-carbohydrate (preferably cyclic) carrier. A
ribonucleotide subunit in which the ribose sugar of the subunit has been so replaced is
referred to herein as a ribose replacement modification subunit (RRMS). A cyclic carrier
may be a carbocyclic ring system, i.e., all ring atoms are carbon atoms, or a heterocyclic ring
system, i.e., one or more ring atoms may be a heteroatom, e.g., nitrogen, oxygen, sulfur. The
cyclic carrier may be a monocyclic ring system, or may contain two or more rings, e.g., fused
rings. The cyclic carrier may be a fully saturated ring system, or it may contain one or more
double bonds.

The carriers further include (i) at least two "backbone attachment points" and (ii) at
least one "tethering attachment point." A "backbone attachment point" as used herein refers
to a functional group, e.g., a hydroxyl group, or generally, a bond available for, and that is
suitable for incorporation of the carrier into the backbone, e.g., the phosphate, or modified
phosphate, e.g., sulfur containing, backbone, of a ribonucleic acid. A "tethering attachment
point" as used herein refers to a constituent ring atom of the cyclic carrier, e.g., a carbon
atom or a heteroatom (distinct from an atom which provides a backbone attachment point),
that connects a selected moiety. The moiety can be, e.g., a ligand, e.g., a targeting or
delivery moiety, or a moiety which alters a physical property, e.g., lipophilicity, of an iRNA
agent. Optionally, the selected moiety is connected by an intervening tether to the cyclic
carrier. Thus, it will include a functional group, e.g., an amino group, or generally, provide a
bond, that is suitable for incorporation or tethering of another chemical entity, e.g., a ligand,
to the constituent ring.

Incorporation of one or more RRMSs described herein into an RNA agent, e.g., an
iRNA agent, particularly when tethered to an appropriate entity, can confer one or more new
properties to the RNA agent and/or alter, enhance or modulate one or more existing
properties in the RNA molecule. E.g., it can alter one or more of lipophilicity or nuclease
resistance. Incorporation of one or more RRMSs described herein into an iRNA agent can,
particularly when the RRMS is tethered to an appropriate entity, modulate, e.g., increase,
binding affinity of an iRNA agent to a target mRNA, change the geometry of the duplex
form of the iRNA agent, alter distribution or target the iRNA agent to a particular part of the
body, or modify the interaction with nucleic acid binding proteins (e.g., during RISC
formation and strand separation).

Accordingly, in one aspect, the invention features an iRNA agent preferably
comprising a first strand and a second strand, wherein at least one subunit having a formula
(R-1) is incorporated into at least one of said strands.

[Structure of formula (R-1) not reproduced]

(R-1)

Referring to formula (R-1), X is N(CO)R 7 , NR 7 or CH 2 ; Y is NR 8 , O, S, CR 9 R 10 , or
absent; and Z is CR 11 R 12 or absent.

Each of R 1 , R 2 , R 3 , R 4 , R 9 , and R 10 is, independently, H, OR a , OR b , (CH 2 ) n OR a , or
(CH 2 ) n OR b , provided that at least one of R 1 , R 2 , R 3 , R 4 , R 9 , and R 10 is OR a or OR b and that at
least one of R 1 , R 2 , R 3 , R 4 , R 9 , and R 10 is (CH 2 ) n OR a , or (CH 2 ) n OR b (when the RRMS is
terminal, one of R 1 , R 2 , R 3 , R 4 , R 9 , and R 10 will include R a and one will include R b ; when the
RRMS is internal, two of R 1 , R 2 , R 3 , R 4 , R 9 , and R 10 will each include an R b ); further
provided that preferably OR a may only be present with (CH 2 ) n OR b and (CH 2 ) n OR a may only
be present with OR b .
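
The provisos on R 1 , R 2 , R 3 , R 4 , R 9 , and R 10 can be read as a short series of checks. The following is a minimal sketch in Python, assuming a hypothetical string encoding of the permitted substituents ("H", "ORa", "ORb", "CH2nORa", "CH2nORb"); the encoding, the function name, and the reading of the "preferably" clause as a mutual-exclusion rule are illustrative assumptions rather than part of the disclosure.

```python
# Hypothetical encoding of the substituents permitted at R1, R2, R3, R4, R9, R10.
DIRECT = {"ORa", "ORb"}              # oxygen bonded directly to the ring
EXOCYCLIC = {"CH2nORa", "CH2nORb"}   # oxygen bonded through (CH2)n
ALLOWED = DIRECT | EXOCYCLIC | {"H"}
POSITIONS = ("R1", "R2", "R3", "R4", "R9", "R10")


def check_provisos(substituents):
    """Return (required_ok, preferred_ok) for a mapping of position -> substituent.

    Positions that are not listed are treated as H.
    """
    values = [substituents.get(p, "H") for p in POSITIONS]
    if any(v not in ALLOWED for v in values):
        raise ValueError("unknown substituent")
    present = set(values)

    # Required: at least one ORa/ORb and at least one (CH2)nORa/(CH2)nORb.
    required_ok = bool(present & DIRECT) and bool(present & EXOCYCLIC)

    # Preferred: ORa only together with (CH2)nORb, (CH2)nORa only together with ORb.
    preferred_ok = required_ok
    if "ORa" in present:
        preferred_ok = (preferred_ok and "CH2nORb" in present
                        and "CH2nORa" not in present)
    if "CH2nORa" in present:
        preferred_ok = (preferred_ok and "ORb" in present
                        and "ORa" not in present)
    return required_ok, preferred_ok
```

For instance, an internal RRMS carrying OR b at one position and (CH 2 ) n OR b at another passes both checks, whereas pairing OR a with (CH 2 ) n OR a passes only the required check.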

Each of R 5 , R 6 , R 11 , and R 12 is, independently, H, C1-C6 alkyl optionally substituted
with 1-3 R 13 , or C(O)NHR 7 ; or R 5 and R 11 together are C3-C8 cycloalkyl optionally
substituted with R 14 .

R 7 is C1-C20 alkyl substituted with NR c R d ; R 8 is C1-C6 alkyl; R 13 is hydroxy, C1-C4
alkoxy, or halo; and R 14 is NR c R 7 .



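R a is a group in which a phosphorus atom is doubly bonded to A and singly bonded to
B and C [structure not reproduced]; and

R b is a group attached to the strand [structure not reproduced].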



Each of A and C is, independently, O or S. 
B is OH, O⁻, or a diphosphate group of the form -O-P(O)(O⁻)-O-P(O)(O⁻)-OH.

R c is H or C1-C6 alkyl; R d is H or a ligand; and n is 1-4.

In a preferred embodiment the ribose is replaced with a pyrroline scaffold, and X is
N(CO)R 7 or NR 7 , Y is CR 9 R 10 , and Z is absent.

In other preferred embodiments the ribose is replaced with a piperidine scaffold, and
X is N(CO)R 7 or NR 7 , Y is CR 9 R 10 , and Z is CR 11 R 12 .

In other preferred embodiments the ribose is replaced with a piperazine scaffold, and
X is N(CO)R 7 or NR 7 , Y is NR 8 , and Z is CR 11 R 12 .

In other preferred embodiments the ribose is replaced with a morpholino scaffold, and
X is N(CO)R 7 or NR 7 , Y is O, and Z is CR 11 R 12 .

In other preferred embodiments the ribose is replaced with a decalin scaffold, and X
is CH 2 ; Y is CR 9 R 10 ; and Z is CR 11 R 12 ; and R 5 and R 11 together are C 6 cycloalkyl.

In other preferred embodiments the ribose is replaced with a decalin/indane scaffold,
and X is CH 2 ; Y is CR 9 R 10 ; and Z is CR 11 R 12 ; and R 5 and R 11 together are C 5
cycloalkyl.

In other preferred embodiments, the ribose is replaced with a hydroxyproline
scaffold.

RRMSs described herein may be incorporated into any double-stranded RNA-like
molecule described herein, e.g., an iRNA agent. An iRNA agent may include a duplex
comprising a hybridized sense and antisense strand, in which the antisense strand and/or the
sense strand may include one or more of the RRMSs described herein. An RRMS can be
introduced at one or more points in one or both strands of a double-stranded iRNA agent. An
RRMS can be placed at or near (within 1, 2, or 3 positions of) the 3' or 5' end of the sense
strand or near (within 2 or 3 positions of) the 3' end of the antisense strand. In some
embodiments it is preferred to not have an RRMS at or near (within 1, 2, or 3 positions of)
the 5' end of the antisense strand. An RRMS can be internal, and will preferably be
positioned in regions not critical for antisense binding to the target.
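
The positional preference stated above (avoiding an RRMS at or within 1, 2, or 3 positions of the 5' end of the antisense strand) can be expressed as a simple distance check. The following is a minimal sketch in Python; the `Placement` record, the 1-based numbering from the 5' end, and the function names are hypothetical conventions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Hypothetical record describing where one RRMS sits in a strand."""
    strand: str    # "sense" or "antisense"
    position: int  # 1-based index counted from the 5' end
    length: int    # total number of subunits in the strand


def distance_from_end(p, end):
    """Number of positions between the subunit and the given end (0 = at the end)."""
    if not 1 <= p.position <= p.length:
        raise ValueError("position lies outside the strand")
    if end == "5'":
        return p.position - 1
    if end == "3'":
        return p.length - p.position
    raise ValueError("end must be \"5'\" or \"3'\"")


def is_disfavored(p, window=3):
    """True if the RRMS is at, or within `window` positions of, the antisense 5' end."""
    return p.strand == "antisense" and distance_from_end(p, "5'") <= window
```

Under these assumptions, an RRMS at position 4 of a 21-subunit antisense strand is flagged (three positions from the 5' end), while one at position 5, or anywhere on the sense strand, is not.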

In an embodiment, an iRNA agent may have an RRMS at (or within 1, 2, or 3
positions of) the 3' end of the antisense strand. In an embodiment, an iRNA agent may have
an RRMS at (or within 1, 2, or 3 positions of) the 3' end of the antisense strand and at (or
within 1, 2, or 3 positions of) the 3' end of the sense strand. In an embodiment, an iRNA
agent may have an RRMS at (or within 1, 2, or 3 positions of) the 3' end of the antisense
strand and an RRMS at the 5' end of the sense strand, in which both ligands are located at the
same end of the iRNA agent.

In certain embodiments, two ligands are tethered, preferably one on each strand, and
are hydrophobic moieties. While not wishing to be bound by theory, it is believed that
pairing of the hydrophobic ligands can stabilize the iRNA agent via intermolecular van der
Waals interactions.

In an embodiment, an iRNA agent may have an RRMS at (or within 1, 2, or 3
positions of) the 3' end of the antisense strand and an RRMS at the 5' end of the sense strand,
in which both RRMSs may share the same ligand (e.g., cholic acid) via connection of their
individual tethers to separate positions on the ligand. A ligand shared between two proximal
RRMSs is referred to herein as a "hairpin ligand."

In other embodiments, an iRNA agent may have an RRMS at the 3' end of the sense
strand and an RRMS at an internal position of the sense strand. An iRNA agent may have an
RRMS at an internal position of the sense strand; or may have an RRMS at an internal
position of the antisense strand; or may have an RRMS at an internal position of the sense
strand and an RRMS at an internal position of the antisense strand.

In preferred embodiments the iRNA agent includes first and second sequences,
which are preferably two separate molecules as opposed to two sequences located on the
same strand, that have sufficient complementarity to each other to hybridize (and thereby form a
duplex region), e.g., under physiological conditions, e.g., under physiological conditions but
not in contact with a helicase or other unwinding enzyme.

WO 2004/080406 PCT/US2004/007070

It is preferred that the first and second sequences be chosen such that the ds iRNA
agent includes a single strand or unpaired region at one or both ends of the molecule. Thus, a
ds iRNA agent contains first and second sequences, preferably paired to contain an overhang,
e.g., one or two 5' or 3' overhangs but preferably a 3' overhang of 2-3 nucleotides. Most
embodiments will have a 3' overhang. Preferred sRNA agents will have single-stranded
overhangs, preferably 3' overhangs, of 1 or preferably 2 or 3 nucleotides in length at each
end. The overhangs can be the result of one strand being longer than the other, or the result
of two strands of the same length being staggered. 5' ends are preferably phosphorylated.
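
The two ways of producing an overhang described above (strands of unequal length, or equal-length strands that are staggered) follow from simple arithmetic on strand lengths. The following is a minimal sketch in Python, assuming a single contiguous duplex region; the function names and parameters are hypothetical and are introduced only for illustration.

```python
def three_prime_overhangs(sense_len, antisense_len, paired,
                          sense_5p_unpaired=0, antisense_5p_unpaired=0):
    """Return (sense 3' overhang, antisense 3' overhang) in nucleotides.

    Assumes one contiguous duplex of `paired` base pairs, so that each strand
    is made up of its 5' unpaired nucleotides, the paired region, and its
    3' unpaired nucleotides.
    """
    sense_3p = sense_len - sense_5p_unpaired - paired
    antisense_3p = antisense_len - antisense_5p_unpaired - paired
    if sense_3p < 0 or antisense_3p < 0:
        raise ValueError("paired region does not fit within the strands")
    return sense_3p, antisense_3p


def is_preferred_3p_overhang(n):
    """True for the 2-3 nucleotide 3' overhang described as preferred."""
    return 2 <= n <= 3


# Equal-length strands, staggered: two 21-mers sharing 19 base pairs.
staggered = three_prime_overhangs(21, 21, 19)

# One strand longer than the other: 19-mer sense, 21-mer antisense.
unequal = three_prime_overhangs(19, 21, 19)
```

In the first case each strand carries a 2-nucleotide 3' overhang; in the second only the longer strand does.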

An RNA agent, e.g., an iRNA agent, containing a preferred, but nonlimiting RRMS is
presented as formula (R-2) in FIG. 4. The carrier includes two "backbone attachment points"
(hydroxyl groups), a "tethering attachment point," and a ligand, which is connected indirectly
to the carrier via an intervening tether. The RRMS may be the 5' or 3' terminal subunit of
the RNA molecule, i.e., one of the two "W" groups may be a hydroxyl group, and the other
"W" group may be a chain of two or more unmodified or modified ribonucleotides.
Alternatively, the RRMS may occupy an internal position, and both "W" groups may be one
or more unmodified or modified ribonucleotides. More than one RRMS may be present in an
RNA molecule, e.g., an iRNA agent.

The modified RNA molecule of formula (R-2) can be obtained using oligonucleotide
synthetic methods known in the art. In a preferred embodiment, the modified RNA molecule
of formula (R-2) can be prepared by incorporating one or more of the corresponding RRMS
monomer compounds (RRMS monomers, see, e.g., A, B, and C in FIG. 4) into a growing
sense or antisense strand, utilizing, e.g., phosphoramidite or H-phosphonate coupling
strategies.

The RRMS monomers generally include two differently functionalized hydroxyl
groups (OFG 1 and OFG 2 above), which are linked to the carrier molecule (see A in FIG. 4),
and a tethering attachment point. As used herein, the term "functionalized hydroxyl group"
means that the hydroxyl proton has been replaced by another substituent. As shown in
representative structures B and C, one hydroxyl group (OFG 1 ) on the carrier is functionalized
with a protecting group (PG). The other hydroxyl group (OFG 2 ) can be functionalized with
either (1) a liquid or solid phase synthesis support reagent (solid circle) directly or indirectly
through a linker, L, as in B, or (2) a phosphorus-containing moiety, e.g., a phosphoramidite as
in C. The tethering attachment point may be connected to a hydrogen atom, a tether, or a
tethered ligand at the time that the monomer is incorporated into the growing sense or
antisense strand (see R in Scheme 1). Thus, the tethered ligand can be, but need not be,
attached to the monomer at the time that the monomer is incorporated into the growing
strand. In certain embodiments, the tether, the ligand or the tethered ligand may be linked to
a "precursor" RRMS after a "precursor" RRMS monomer has been incorporated into the
strand.

The (OFG 1 ) protecting group may be selected as desired, e.g., from T.W. Greene and
P.G.M. Wuts, Protective Groups in Organic Synthesis, 2d. Ed., John Wiley and Sons (1991).
The protecting group is preferably stable under amidite synthesis conditions, storage
conditions, and oligonucleotide synthesis conditions. Hydroxyl groups, -OH, are
nucleophilic groups (i.e., Lewis bases), which react through the oxygen with electrophiles
(i.e., Lewis acids). Hydroxyl groups in which the hydrogen has been replaced with a
protecting group, e.g., a triarylmethyl group or a trialkylsilyl group, are essentially unreactive
as nucleophiles in displacement reactions. Thus, the protected hydroxyl group is useful in
preventing, e.g., homocoupling of compounds exemplified by structure C during
oligonucleotide synthesis. A preferred protecting group is the dimethoxytrityl group.

When the OFG 2 in B includes a linker, e.g., a long organic linker, connected to a
soluble or insoluble support reagent, solution or solid phase synthesis techniques can be
employed to build up a chain of natural and/or modified ribonucleotides once OFG 1 is
deprotected and free to react as a nucleophile with another nucleoside or monomer
containing an electrophilic group (e.g., an amidite group). Alternatively, a natural or
modified ribonucleotide or oligoribonucleotide chain can be coupled to monomer C via an
amidite group or H-phosphonate group at OFG 2 . Subsequent to this operation, OFG 1 can be
deblocked, and the restored nucleophilic hydroxyl group can react with another nucleoside or
monomer containing an electrophilic group (see FIG. 1). R' can be substituted or
unsubstituted alkyl or alkenyl. In preferred embodiments, R' is methyl, allyl or
2-cyanoethyl. R'' may be a C1-C10 alkyl group; preferably it is a branched group containing three
or more carbons, e.g., isopropyl.

OFG 2 in B can be hydroxyl functionalized with a linker, which in turn contains a
liquid or solid phase synthesis support reagent at the other linker terminus. The support
reagent can be any support medium that can support the monomers described herein. The
monomer can be attached to an insoluble support via a linker, L, which allows the monomer
(and the growing chain) to be solubilized in the solvent in which the support is placed. The
solubilized, yet immobilized, monomer can react with reagents in the surrounding solvent;
unreacted reagents and soluble by-products can be readily washed away from the solid
support to which the monomer or monomer-derived products are attached. Alternatively, the
monomer can be attached to a soluble support moiety, e.g., polyethylene glycol (PEG), and
liquid phase synthesis techniques can be used to build up the chain. Linker and support
medium selection is within the skill of the art. Generally the linker may be -C(O)(CH 2 ) q C(O)-,
or -C(O)(CH 2 ) q S-; preferably, it is oxalyl, succinyl or thioglycolyl. Standard controlled pore
glass solid phase synthesis supports cannot be used in conjunction with fluoride labile 5'
silyl protecting groups because the glass is degraded by fluoride with a significant reduction
in the amount of full-length product. Fluoride-stable polystyrene based supports or PEG are
preferred.

Preferred carriers have the general formula (R-3) provided below. (In that structure
preferred backbone attachment points can be chosen from R 1 or R 2 ; R 3 or R 4 ; or R 9 or R 10 if
Y is CR 9 R 10 (two positions are chosen to give two backbone attachment points, e.g., R 1 and
R 4 , or R 4 and R 9 ). Preferred tethering attachment points include R 7 ; R 5 or R 6 when X is CH 2 .
The carriers are described below as an entity, which can be incorporated into a strand. Thus,
it is understood that the structures also encompass the situations wherein one (in the case of a
terminal position) or two (in the case of an internal position) of the attachment points, e.g., R 1
or R 2 ; R 3 or R 4 ; or R 9 or R 10 (when Y is CR 9 R 10 ), is connected to the phosphate, or modified
phosphate, e.g., sulfur containing, backbone. E.g., one of the above-named R groups can be
-CH 2 -, wherein one bond is connected to the carrier and one to a backbone atom, e.g., a
linking oxygen or a central phosphorus atom.)

[Structure of formula (R-3) not reproduced]

(R-3)

X is N(CO)R 7 , NR 7 or CH 2 ; Y is NR 8 , O, S, or CR 9 R 10 ; and Z is CR 11 R 12 or absent.

Each of R 1 , R 2 , R 3 , R 4 , R 9 , and R 10 is, independently, H, OR a , or (CH 2 ) n OR b , provided
that at least two of R 1 , R 2 , R 3 , R 4 , R 9 , and R 10 are OR a and/or (CH 2 ) n OR b .

Each of R 5 , R 6 , R 11 , and R 12 is, independently, a ligand, H, C1-C6 alkyl optionally
substituted with 1-3 R 13 , or C(O)NHR 7 ; or R 5 and R 11 together are C3-C8 cycloalkyl
optionally substituted with R 14 .

R 7 is H, a ligand, or C1-C20 alkyl substituted with NR c R d ; R 8 is H or C1-C6 alkyl; R 13
is hydroxy, C1-C4 alkoxy, or halo; R 14 is NR c R 7 ; R 15 is C1-C6 alkyl optionally substituted
with cyano, or C2-C6 alkenyl; R 16 is C1-C10 alkyl; and R 17 is a liquid or solid phase support
reagent.

L is -C(O)(CH 2 ) q C(O)-, or -C(O)(CH 2 ) q S-; R a is CAr 3 ; R b is P(O)(O⁻)H,
P(OR 15 )N(R 16 ) 2 or L-R 17 ; R c is H or C1-C6 alkyl; and R d is H or a ligand.

Each Ar is, independently, C6-C10 aryl optionally substituted with C1-C4 alkoxy; n is 
1-4; and q is 0-4. 

Exemplary carriers include those in which, e.g., X is N(CO)R 7 or NR 7 , Y is CR 9 R 10 ,
and Z is absent; or X is N(CO)R 7 or NR 7 , Y is CR 9 R 10 , and Z is CR 11 R 12 ; or X is N(CO)R 7 or
NR 7 , Y is NR 8 , and Z is CR 11 R 12 ; or X is N(CO)R 7 or NR 7 , Y is O, and Z is CR 11 R 12 ; or X is
CH 2 ; Y is CR 9 R 10 ; Z is CR 11 R 12 , and R 5 and R 11 together form C 6 cycloalkyl (H, z = 2), or
the indane ring system, e.g., X is CH 2 ; Y is CR 9 R 10 ; Z is CR 11 R 12 , and R 5 and R 11 together
form C 5 cycloalkyl (H, z = 1).
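
The exemplary carriers listed above amount to a lookup from the choice of X, Y, and Z (and, for the carbocyclic cases, the size of the ring formed by R 5 and R 11 ) to a named ring system. The following is a minimal sketch of that lookup in Python; the string tokens, the function name, and the `fused_ring_size` parameter are hypothetical and serve only to illustrate the correspondence stated in the text.

```python
# Hypothetical string tokens standing in for the groups named in the text.
NITROGEN_X = {"N(CO)R7", "NR7"}


def classify_carrier(x, y, z=None, fused_ring_size=None):
    """Return the ring system named in the text for a given X, Y, Z, or None.

    `z=None` means Z is absent. `fused_ring_size` is the size of the
    cycloalkyl ring formed by R5 and R11 together (6 or 5), if any.
    """
    if x in NITROGEN_X:
        if y == "CR9R10" and z is None:
            return "pyrroline"
        if z == "CR11R12":
            return {
                "CR9R10": "piperidine",
                "NR8": "piperazine",
                "O": "morpholine",
            }.get(y)
        return None
    if x == "CH2" and y == "CR9R10" and z == "CR11R12":
        return {6: "decalin", 5: "indane"}.get(fused_ring_size)
    return None
```

Combinations that the text does not list, such as Y being S, return `None` in this sketch rather than a guessed name.
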

In certain embodiments, the carrier may be based on the pyrroline ring system or the
3-hydroxyproline ring system, e.g., X is N(CO)R 7 or NR 7 , Y is CR 9 R 10 , and Z is absent (D).

[Structure D not reproduced: a five-membered nitrogen-containing ring bearing OFG 2 and
-CH 2 OFG 1 , with the LIGAND attached at nitrogen]

OFG 1 is preferably attached to a primary carbon, e.g., an exocyclic alkylene
group, e.g., a methylene group, connected to one of the carbons in the five-membered ring
(-CH 2 OFG 1 in D). OFG 2 is preferably attached directly to one of the carbons in the
five-membered ring (-OFG 2 in D). For the pyrroline-based carriers, -CH 2 OFG 1 may be attached
to C-2 and OFG 2 may be attached to C-3; or -CH 2 OFG 1 may be attached to C-3 and OFG 2
may be attached to C-4. In certain embodiments, CH 2 OFG 1 and OFG 2 may be geminally
substituted to one of the above-referenced carbons. For the 3-hydroxyproline-based carriers,
-CH 2 OFG 1 may be attached to C-2 and OFG 2 may be attached to C-4. The pyrroline- and
3-hydroxyproline-based monomers may therefore contain linkages (e.g., carbon-carbon bonds)
wherein bond rotation is restricted about that particular linkage, e.g., restriction resulting from
the presence of a ring. Thus, CH 2 OFG 1 and OFG 2 may be cis or trans with respect to one
another in any of the pairings delineated above. Accordingly, all cis/trans isomers are
expressly included. The monomers may also contain one or more asymmetric centers and
thus occur as racemates and racemic mixtures, single enantiomers, individual diastereomers
and diastereomeric mixtures. All such isomeric forms of the monomers are expressly
included. The tethering attachment point is preferably nitrogen.

In certain embodiments, the carrier may be based on the piperidine ring system (E),
e.g., X is N(CO)R 7 or NR 7 , Y is CR 9 R 10 , and Z is CR 11 R 12 .

[Structure E not reproduced: a six-membered nitrogen-containing ring bearing OFG 2 and
-(CH 2 ) n OFG 1 , with the LIGAND attached at nitrogen]

OFG 1 is preferably
attached to a primary carbon, e.g., an exocyclic alkylene group, e.g., a methylene group (n=1)
or ethylene group (n=2), connected to one of the carbons in the six-membered ring
[-(CH 2 ) n OFG 1 in E]. OFG 2 is preferably attached directly to one of the carbons in the
six-membered ring (-OFG 2 in E). -(CH 2 ) n OFG 1 and OFG 2 may be disposed in a geminal manner
on the ring, i.e., both groups may be attached to the same carbon, e.g., at C-2, C-3, or C-4.
Alternatively, -(CH 2 ) n OFG 1 and OFG 2 may be disposed in a vicinal manner on the ring, i.e.,
both groups may be attached to adjacent ring carbon atoms, e.g., -(CH 2 ) n OFG 1 may be
attached to C-2 and OFG 2 may be attached to C-3; -(CH 2 ) n OFG 1 may be attached to C-3 and
OFG 2 may be attached to C-2; -(CH 2 ) n OFG 1 may be attached to C-3 and OFG 2 may be
attached to C-4; or -(CH 2 ) n OFG 1 may be attached to C-4 and OFG 2 may be attached to C-3.
The piperidine-based monomers may therefore contain linkages (e.g., carbon-carbon bonds)
wherein bond rotation is restricted about that particular linkage, e.g., restriction resulting from
the presence of a ring. Thus, -(CH 2 ) n OFG 1 and OFG 2 may be cis or trans with respect to one
another in any of the pairings delineated above. Accordingly, all cis/trans isomers are
expressly included. The monomers may also contain one or more asymmetric centers and
thus occur as racemates and racemic mixtures, single enantiomers, individual diastereomers
and diastereomeric mixtures. All such isomeric forms of the monomers are expressly
included. The tethering attachment point is preferably nitrogen.

In certain embodiments, the carrier may be based on the piperazine ring system (F),
e.g., X is N(CO)R 7 or NR 7 , Y is NR 8 , and Z is CR 11 R 12 , or the morpholine ring system (G),
e.g., X is N(CO)R 7 or NR 7 , Y is O, and Z is CR 11 R 12 .

[Structures F and G not reproduced: six-membered rings each bearing OFG 2 and
-CH 2 OFG 1 , with the LIGAND attached at nitrogen; F additionally carries R''']

OFG 1 is preferably
attached to a primary carbon, e.g., an exocyclic alkylene group, e.g., a methylene group,
connected to one of the carbons in the six-membered ring (-CH 2 OFG 1 in F or G). OFG 2 is
preferably attached directly to one of the carbons in the six-membered rings (-OFG 2 in F or
G). For both F and G, -CH 2 OFG 1 may be attached to C-2 and OFG 2 may be attached to C-3;
or vice versa. In certain embodiments, CH 2 OFG 1 and OFG 2 may be geminally substituted to
one of the above-referenced carbons. The piperazine- and morpholine-based monomers may
therefore contain linkages (e.g., carbon-carbon bonds) wherein bond rotation is restricted
about that particular linkage, e.g., restriction resulting from the presence of a ring. Thus,
CH 2 OFG 1 and OFG 2 may be cis or trans with respect to one another in any of the pairings
delineated above. Accordingly, all cis/trans isomers are expressly included. The monomers
may also contain one or more asymmetric centers and thus occur as racemates and racemic
mixtures, single enantiomers, individual diastereomers and diastereomeric mixtures. All
such isomeric forms of the monomers are expressly included. R''' can be, e.g., C1-C6 alkyl,
preferably CH 3 . The tethering attachment point is preferably nitrogen in both F and G.

In certain embodiments, the carrier may be based on the decalin ring system, e.g., X
is CH 2 ; Y is CR 9 R 10 ; Z is CR 11 R 12 , and R 5 and R 11 together form C 6 cycloalkyl (H, z = 2),
or the indane ring system, e.g., X is CH 2 ; Y is CR 9 R 10 ; Z is CR 11 R 12 , and R 5 and R 11 together
form C 5 cycloalkyl (H, z = 1).

[Structure H not reproduced: a fused bicyclic ring system bearing OFG 2 and
-(CH 2 ) n OFG 1 ]

OFG 1 is preferably attached to a primary carbon,
e.g., an exocyclic methylene group (n=1) or ethylene group (n=2) connected to one of C-2,
C-3, C-4, or C-5 [-(CH 2 ) n OFG 1 in H]. OFG 2 is preferably attached directly to one of C-2,
C-3, C-4, or C-5 (-OFG 2 in H). -(CH 2 ) n OFG 1 and OFG 2 may be disposed in a geminal manner
on the ring, i.e., both groups may be attached to the same carbon, e.g., at C-2, C-3, C-4, or
C-5. Alternatively, -(CH 2 ) n OFG 1 and OFG 2 may be disposed in a vicinal manner on the ring,
i.e., both groups may be attached to adjacent ring carbon atoms, e.g., -(CH 2 ) n OFG 1 may be
attached to C-2 and OFG 2 may be attached to C-3; -(CH 2 ) n OFG 1 may be attached to C-3 and
OFG 2 may be attached to C-2; -(CH 2 ) n OFG 1 may be attached to C-3 and OFG 2 may be
attached to C-4; -(CH 2 ) n OFG 1 may be attached to C-4 and OFG 2 may be attached to C-3;
-(CH 2 ) n OFG 1 may be attached to C-4 and OFG 2 may be attached to C-5; or -(CH 2 ) n OFG 1 may
be attached to C-5 and OFG 2 may be attached to C-4. The decalin or indane-based
monomers may therefore contain linkages (e.g., carbon-carbon bonds) wherein bond rotation
is restricted about that particular linkage, e.g., restriction resulting from the presence of a ring.
Thus, -(CH 2 ) n OFG 1 and OFG 2 may be cis or trans with respect to one another in any of the
pairings delineated above. Accordingly, all cis/trans isomers are expressly included. The
monomers may also contain one or more asymmetric centers and thus occur as racemates and
racemic mixtures, single enantiomers, individual diastereomers and diastereomeric mixtures.
All such isomeric forms of the monomers are expressly included. In a preferred
embodiment, the substituents at C-1 and C-6 are trans with respect to one another. The
tethering attachment point is preferably C-6 or C-7.

Other carriers may include those based on 3-hydroxyproline (J). Thus, -(CH 2 ) n OFG 1 and
OFG 2 may be cis or trans with respect to one another. Accordingly, all cis/trans isomers are
expressly included. The monomers may also contain one or more asymmetric centers
and thus occur as racemates and racemic mixtures, single enantiomers, individual
diastereomers and diastereomeric mixtures. All such isomeric forms of the monomers are
expressly included. The tethering attachment point is preferably nitrogen.

[Structure J, bearing a LIGAND, not reproduced]

Representative carriers are shown in FIG. 5.

In certain embodiments, a moiety, e.g., a ligand, may be connected indirectly to the
carrier via the intermediacy of an intervening tether. Tethers are connected to the carrier at
the tethering attachment point (TAP) and may include any C1-C100 carbon-containing moiety
(e.g., C1-C75, C1-C50, C1-C20, C1-C10, C1-C6), preferably having at least one nitrogen atom. In
preferred embodiments, the nitrogen atom forms part of a terminal amino group on the tether,
which may serve as a connection point for the ligand. Preferred tethers (underlined) include
TAP-(CH 2 ) n NH 2 ; TAP-C(O)(CH 2 ) n NH 2 ; or TAP-NR''''(CH 2 ) n NH 2 , in which n is 1-6 and
R'''' is C1-C6 alkyl, and R d is hydrogen or a ligand. In other embodiments, the nitrogen may
form part of a terminal oxyamino group, e.g., -ONH 2 , or hydrazino group, -NHNH 2 . The
tether may optionally be substituted, e.g., with hydroxy, alkoxy, perhaloalkyl, and/or
optionally inserted with one or more additional heteroatoms, e.g., N, O, or S. Preferred
tethered ligands may include, e.g., TAP-(CH 2 ) n NH(LIGAND),
TAP-C(O)(CH 2 ) n NH(LIGAND), or TAP-NR''''(CH 2 ) n NH(LIGAND);
TAP-(CH 2 ) n ONH(LIGAND), TAP-C(O)(CH 2 ) n ONH(LIGAND), or
TAP-NR''''(CH 2 ) n ONH(LIGAND); TAP-(CH 2 ) n NHNH 2 (LIGAND),
TAP-C(O)(CH 2 ) n NHNH 2 (LIGAND), or TAP-NR''''(CH 2 ) n NHNH 2 (LIGAND).
In other embodiments the tether may include an electrophilic moiety, preferably at the
terminal position of the tether. Preferred electrophilic moieties include, e.g., an aldehyde,
alkyl halide, mesylate, tosylate, nosylate, or brosylate, or an activated carboxylic acid ester,
e.g., an NHS ester, or a pentafluorophenyl ester. Preferred tethers (underlined) include
TAP-(CH2)nCHO; TAP-C(O)(CH2)nCHO; or TAP-NR''''(CH2)nCHO, in which n is 1-6 and R''''
is C1-C6 alkyl; or TAP-(CH2)nC(O)ONHS; TAP-C(O)(CH2)nC(O)ONHS; or
TAP-NR''''(CH2)nC(O)ONHS, in which n is 1-6 and R'''' is C1-C6 alkyl;
TAP-(CH2)nC(O)OC6F5; TAP-C(O)(CH2)nC(O)OC6F5; or TAP-NR''''(CH2)nC(O)OC6F5,
in which n is 1-6 and R'''' is C1-C6 alkyl; or -(CH2)nCH2LG; TAP-C(O)(CH2)nCH2LG; or
TAP-NR''''(CH2)nCH2LG, in which n is 1-6 and R'''' is C1-C6 alkyl (LG can be a leaving
group, e.g., halide, mesylate, tosylate, nosylate, brosylate). Tethering can be carried out by
coupling a nucleophilic group of a ligand, e.g., a thiol or amino group, with an electrophilic
group on the tether.
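
The electrophilic tether families listed above follow a regular pattern: one of three attachment styles, a (CH2)n spacer with n from 1 to 6, and one of four terminal groups. The following Python sketch enumerates those combinations as formula strings. It is only an illustration; the names `PREFIXES`, `TERMINI`, and `enumerate_tethers` are hypothetical, and writing every member with a TAP- prefix is an assumption made for uniformity.

```python
from itertools import product

# Attachment styles named in the text: alkyl, carbonyl, or N-alkyl amine.
PREFIXES = ["TAP-(CH2)n", "TAP-C(O)(CH2)n", "TAP-NR''''(CH2)n"]

# Terminal electrophilic groups named in the text: aldehyde, NHS ester,
# pentafluorophenyl ester, and a carbon bearing a leaving group (LG).
TERMINI = ["CHO", "C(O)ONHS", "C(O)OC6F5", "CH2LG"]


def enumerate_tethers(n_values=range(1, 7)):
    """Return one formula string per (prefix, terminus, n) combination."""
    formulas = []
    for prefix, terminus, n in product(PREFIXES, TERMINI, n_values):
        # Substitute the concrete spacer length for the symbolic "n".
        formulas.append(prefix.replace("(CH2)n", f"(CH2){n}") + terminus)
    return formulas


if __name__ == "__main__":
    tethers = enumerate_tethers()
    print(len(tethers), "tether formulas, e.g.", tethers[0])
```

Such a listing is bookkeeping only; it says nothing about which tethers are synthetically convenient or preferred in practice.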

Tethered Entities

A wide variety of entities can be tethered to an iRNA agent, e.g., to the carrier of an
RRMS. Examples are described below in the context of an RRMS, but that is only a preferred
arrangement; entities can be coupled at other points to an iRNA agent.

Preferred moieties are ligands, which are coupled, preferably covalently, either
directly or indirectly via an intervening tether, to the RRMS carrier. In preferred
embodiments, the ligand is attached to the carrier via an intervening tether. As discussed
above, the ligand or tethered ligand may be present on the RRMS monomer when the RRMS
monomer is incorporated into the growing strand. In some embodiments, the ligand may be
incorporated into a "precursor" RRMS after a "precursor" RRMS monomer has been
incorporated into the growing strand. For example, an RRMS monomer having, e.g., an
amino-terminated tether (i.e., having no associated ligand), e.g., TAP-(CH2)nNH2, may be
incorporated into a growing sense or antisense strand. In a subsequent operation, i.e., after
incorporation of the precursor monomer into the strand, a ligand having an electrophilic
group, e.g., a pentafluorophenyl ester or aldehyde group, can be attached to the
precursor RRMS by coupling the electrophilic group of the ligand with the terminal
nucleophilic group of the precursor RRMS tether.

In preferred embodiments, a ligand alters the distribution, targeting or lifetime of an
iRNA agent into which it is incorporated. In preferred embodiments a ligand provides an
enhanced affinity for a selected target, e.g., molecule, cell or cell type, compartment, e.g., a
cellular or organ compartment, tissue, organ or region of the body, as, e.g., compared to a
species absent such a ligand. Preferred ligands will not take part in duplex pairing in a
duplexed nucleic acid.

Preferred ligands can improve transport, hybridization, and specificity properties and 
may also improve nuclease resistance of the resultant natural or modified 
oligoribonucleotide, or a polymeric molecule comprising any combination of monomers 
described herein and/or natural or modified ribonucleotides. 

Ligands in general can include therapeutic modifiers, e.g., for enhancing uptake; 
diagnostic compounds or reporter groups e.g., for monitoring distribution; cross-linking 
agents; and nuclease-resistance conferring moieties. General examples include lipids, 
steroids, vitamins, sugars, proteins, peptides, polyamines, and peptide mimics. 

Ligands can include a naturally occurring substance, such as a protein (e.g., human
serum albumin (HSA), low-density lipoprotein (LDL), or globulin); a carbohydrate (e.g., a
dextran, pullulan, chitin, chitosan, inulin, cyclodextrin or hyaluronic acid); or a lipid. The
ligand may also be a recombinant or synthetic molecule, such as a synthetic polymer, e.g., a
synthetic polyamino acid. Examples of polyamino acids and other synthetic polymers include
polylysine (PLL), poly L-aspartic acid, poly L-glutamic acid, styrene-maleic acid anhydride
copolymer, poly(L-lactide-co-glycolide) copolymer, divinyl ether-maleic anhydride
copolymer, N-(2-hydroxypropyl)methacrylamide copolymer (HPMA), polyethylene glycol
(PEG), polyvinyl alcohol (PVA), polyurethane, poly(2-ethylacrylic acid),
N-isopropylacrylamide polymers, or polyphosphazene. Examples of polyamines include:
polyethylenimine, polylysine (PLL), spermine, spermidine, polyamine,
pseudopeptide-polyamine, peptidomimetic polyamine, dendrimer polyamine, arginine, amidine,
protamine, cationic lipid, cationic porphyrin, quaternary salt of a polyamine, or an alpha
helical peptide.

Ligands can also include targeting groups, e.g., a cell or tissue targeting agent, e.g., a
lectin, glycoprotein, lipid or protein, e.g., an antibody, that binds to a specified cell type such
as a cancer cell, endothelial cell, or bone cell. A targeting group can be a thyrotropin,
melanotropin, lectin, glycoprotein, surfactant protein A, mucin carbohydrate, multivalent
lactose, multivalent galactose, N-acetyl-galactosamine, N-acetyl-glucosamine, multivalent
mannose, multivalent fucose, glycosylated polyaminoacids,
transferrin, bisphosphonate, polyglutamate, polyaspartate, a lipid, cholesterol, a steroid, bile
acid, folate, vitamin B12, biotin, or an RGD peptide or RGD peptide mimetic.

Other examples of ligands include dyes, intercalating agents (e.g., acridines),
cross-linkers (e.g., psoralen, mitomycin C), porphyrins (TPPC4, texaphyrin, sapphyrin),
polycyclic aromatic hydrocarbons (e.g., phenazine, dihydrophenazine), artificial
endonucleases (e.g., EDTA), lipophilic molecules (e.g., cholesterol, cholic acid, adamantane
acetic acid, 1-pyrene butyric acid, dihydrotestosterone, 1,3-Bis-O(hexadecyl)glycerol,
geranyloxyhexyl group, hexadecylglycerol, borneol, menthol, 1,3-propanediol, heptadecyl
group, palmitic acid, myristic acid, O3-(oleoyl)lithocholic acid, O3-(oleoyl)cholenic acid,
dimethoxytrityl, or phenoxazine), peptide conjugates (e.g., antennapedia peptide, Tat
peptide), alkylating agents, phosphate, amino, mercapto, PEG (e.g., PEG-40K), MPEG,
[MPEG]2, polyamino, alkyl, substituted alkyl, radiolabeled markers, enzymes, haptens (e.g.,
biotin), transport/absorption facilitators (e.g., aspirin, vitamin E, folic acid), synthetic
ribonucleases (e.g., imidazole, bisimidazole, histamine, imidazole clusters,
acridine-imidazole conjugates, Eu3+ complexes of tetraazamacrocycles), dinitrophenyl, HRP, or AP.

Ligands can be proteins, e.g., glycoproteins, or peptides, e.g., molecules having a
specific affinity for a co-ligand, or antibodies, e.g., an antibody that binds to a specified cell
type such as a cancer cell, endothelial cell, or bone cell. Ligands may also include hormones
and hormone receptors. They can also include non-peptidic species, such as lipids, lectins,
carbohydrates, vitamins, cofactors, multivalent lactose, multivalent galactose,
N-acetyl-galactosamine, N-acetyl-glucosamine, multivalent mannose, or multivalent fucose. The
ligand can be, for example, a lipopolysaccharide, an activator of p38 MAP kinase, or an
activator of NF-kB.

The ligand can be a substance, e.g., a drug, which can increase the uptake of the iRNA
agent into the cell, for example, by disrupting the cell's cytoskeleton, e.g., by disrupting the
cell's microtubules, microfilaments, and/or intermediate filaments. The drug can be, for
example, taxol, vincristine, vinblastine, cytochalasin, nocodazole, jasplakinolide, latrunculin
A, phalloidin, swinholide A, indanocine, or myoseverin.

The ligand can increase the uptake of the iRNA agent into the cell by activating an
inflammatory response, for example. Exemplary ligands that would have such an effect
include tumor necrosis factor alpha (TNFalpha), interleukin-1 beta, or gamma interferon.

In one aspect, the ligand is a lipid or lipid-based molecule. Such a lipid or
lipid-based molecule preferably binds a serum protein, e.g., human serum albumin (HSA). An
HSA binding ligand allows for distribution of the conjugate to a target tissue, e.g., a
non-kidney target tissue of the body. Preferably, the target tissue is the liver, preferably
parenchymal cells of the liver. Other molecules that can bind HSA can also be used as
ligands. For example, naproxen or aspirin can be used. A lipid or lipid-based ligand can (a)
increase resistance to degradation of the conjugate, (b) increase targeting or transport into a
target cell or cell membrane, and/or (c) be used to adjust binding to a serum protein, e.g.,
HSA.

A lipid based ligand can be used to modulate, e.g., control the binding of the
conjugate to a target tissue. For example, a lipid or lipid-based ligand that binds to HSA
more strongly will be less likely to be targeted to the kidney and therefore less likely to be
cleared from the body. A lipid or lipid-based ligand that binds to HSA less strongly can be
used to target the conjugate to the kidney.

In a preferred embodiment, the lipid based ligand binds HSA. Preferably, it binds
HSA with a sufficient affinity such that the conjugate will be preferably distributed to a
non-kidney tissue. However, it is preferred that the affinity not be so strong that the HSA-ligand
binding cannot be reversed.

In another preferred embodiment, the lipid based ligand binds HSA weakly or not at
all, such that the conjugate will be preferably distributed to the kidney. Other moieties that
target to kidney cells can also be used in place of or in addition to the lipid based ligand.

In another aspect, the ligand is a moiety, e.g., a vitamin, which is taken up by a target
cell, e.g., a proliferating cell. These are particularly useful for treating disorders
characterized by unwanted cell proliferation, e.g., of the malignant or non-malignant type,
e.g., cancer cells. Exemplary vitamins include vitamin A, E, and K. Other exemplary
vitamins include B vitamins, e.g., folic acid, B12, riboflavin, biotin, pyridoxal, or other
vitamins or nutrients taken up by cancer cells. Also included are HSA and low density
lipoprotein (LDL).

In another aspect, the ligand is a cell-permeation agent, preferably a helical
cell-permeation agent. Preferably, the agent is amphipathic. An exemplary agent is a peptide
such as tat or antennapedia. If the agent is a peptide, it can be modified, including a
peptidylmimetic, invertomers, non-peptide or pseudo-peptide linkages, and use of D-amino
acids. The helical agent is preferably an alpha-helical agent, which preferably has a
lipophilic and a lipophobic phase.

The ligand can be a peptide or peptidomimetic. A peptidomimetic (also referred to
herein as an oligopeptidomimetic) is a molecule capable of folding into a defined
three-dimensional structure similar to a natural peptide. The attachment of peptides and
peptidomimetics to iRNA agents can affect pharmacokinetic distribution of the iRNA, such
as by enhancing cellular recognition and absorption. The peptide or peptidomimetic moiety
can be about 5-50 amino acids long, e.g., about 5, 10, 15, 20, 25, 30, 35, 40, 45, or 50 amino
acids long (see Table 1, for example).

Table 1. Exemplary Cell Permeation Peptides

Cell Permeation Peptide | Amino acid Sequence | Reference
----------------------- | ------------------- | ---------
Penetratin | RQIKIWFQNRRMKWKK (SEQ ID NO:6737) | Derossi et al., J. Biol. Chem. 269:10444, 1994
Tat fragment (48-60) | GRKKRRQRRRPPQC (SEQ ID NO:6738) | Vives et al., J. Biol. Chem., 272:16010, 1997
Signal Sequence-based peptide | GALFLGWLGAAGSTMGAWSQPKKKRKV (SEQ ID NO:6738) | Chaloin et al., Biochem. Biophys. Res. Commun., 243:601, 1998
PVEC | LLIILRRRIRKQAHAHSK (SEQ ID NO:6739) | Elmquist et al., Exp. Cell Res., 269:237, 2001
Transportan | GWTLNSAGYLLKINLKALAALAKKIL (SEQ ID NO:6740) | Pooga et al., FASEB J., 12:67, 1998
Amphiphilic model peptide | KLALKLALKALKAALKLA (SEQ ID NO:6741) | Oehlke et al., Mol. Ther., 2:339, 2000
 | RRRRRRRRR (SEQ ID NO:6742) | Mitchell et al., J. Pept. Res., 56:318, 2000
Bacterial cell wall permeating | KFFKFFKFFK (SEQ ID NO:6743) |
LL-37 | LLGDFFRKSKEKIGKEFKETVQRIKDFLRNLVPRTES (SEQ ID NO:6744) |
Cecropin P1 | SWLSKTAKKLENSAKKRISEGIAIAIQGGPR (SEQ ID NO:6745) |
α-defensin | ACYCRIPACIAGERRYGTCIYQGRLWAFCC (SEQ ID NO:6746) |
β-defensin | DHYNCVSSGGQCLYSACPIFTKIQGTCYRGKAKCCK (SEQ ID NO:6747) |
Bactenecin | RKCRIVVIRVCR (SEQ ID NO:6748) |
PR-39 | RRRPRPPYLPRPRPPPFFPPRLPPRIPPGFPPRFPPRFPGKR-NH2 (SEQ ID NO:6749) |
Indolicidin | ILPWKWPWWPWRR-NH2 (SEQ ID NO:6750) |
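
As a rough illustration of how the peptides of Table 1 compare with the stated length range of about 5-50 amino acids, the following Python sketch tabulates the length and an approximate net charge for a few of the listed sequences. The helper names (`net_charge`, `in_length_range`) and the charge rule (+1 for each Lys or Arg, -1 for each Asp or Glu, with His and the termini ignored) are illustrative assumptions and are not part of the disclosure.

```python
# A few sequences copied from Table 1 (one-letter amino acid codes).
PEPTIDES = {
    "Penetratin": "RQIKIWFQNRRMKWKK",
    "Tat fragment (48-60)": "GRKKRRQRRRPPQC",
    "Transportan": "GWTLNSAGYLLKINLKALAALAKKIL",
    "Bactenecin": "RKCRIVVIRVCR",
}


def net_charge(seq):
    """Crude net charge: +1 per Lys/Arg, -1 per Asp/Glu (assumed rule)."""
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative


def in_length_range(seq, lo=5, hi=50):
    """True if the peptide length falls within the stated 5-50 residue range."""
    return lo <= len(seq) <= hi


if __name__ == "__main__":
    for name, seq in PEPTIDES.items():
        print(f"{name}: length={len(seq)}, charge={net_charge(seq):+d}, "
              f"in range={in_length_range(seq)}")
```

All four example peptides are strongly cationic under this simple rule, which is in keeping with the cationic peptide class mentioned in the text.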





A peptide or peptidomimetic can be, for example, a cell permeation peptide, cationic
peptide, amphipathic peptide, or hydrophobic peptide (e.g., consisting primarily of Tyr, Trp
or Phe). The peptide moiety can be a dendrimer peptide, constrained peptide or crosslinked
peptide. In another alternative, the peptide moiety can include a hydrophobic membrane
translocation sequence (MTS). An exemplary hydrophobic MTS-containing peptide is
RFGF having the amino acid sequence AAVALLPAVLLALLAP (SEQ ID NO:6751). An
RFGF analogue (e.g., amino acid sequence AALLPVLLAAP (SEQ ID NO:6752))
containing a hydrophobic MTS can also be a targeting moiety. The peptide moiety can be a
"delivery" peptide, which can carry large polar molecules including peptides,
oligonucleotides, and proteins across cell membranes. For example, sequences from the HIV
Tat protein (GRKKRRQRRRPPQ (SEQ ID NO:6753)) and the Drosophila Antennapedia
protein (RQIKIWFQNRRMKWKK (SEQ ID NO:6754)) have been found to be capable of
functioning as delivery peptides. A peptide or peptidomimetic can be encoded by a random
sequence of DNA, such as a peptide identified from a phage-display library, or
one-bead-one-compound (OBOC) combinatorial library (Lam et al., Nature, 354:82-84, 1991).
Preferably the peptide or peptidomimetic tethered to an iRNA agent via an incorporated
monomer unit is a cell targeting peptide such as an arginine-glycine-aspartic acid
(RGD)-peptide, or RGD mimic. A peptide moiety can range in length from about 5 amino acids to
about 40 amino acids. The peptide moieties can have a structural modification, such as to
increase stability or direct conformational properties. Any of the structural modifications
described below can be utilized.

An RGD peptide moiety can be used to target a tumor cell, such as an endothelial
tumor cell or a breast cancer tumor cell (Zitzmann et al., Cancer Res., 62:5139-43, 2002).
An RGD peptide can facilitate targeting of an iRNA agent to tumors of a variety of other
tissues, including the lung, kidney, spleen, or liver (Aoki et al., Cancer Gene Therapy
8:783-787, 2001). The RGD peptide can be linear or cyclic, and can be modified, e.g., glycosylated
or methylated to facilitate targeting to specific tissues. For example, a glycosylated RGD
peptide can deliver an iRNA agent to a tumor cell expressing αvβ3 (Haubner et al., Jour.
Nucl. Med., 42:326-336, 2001).

Peptides that target markers enriched in proliferating cells can be used. E.g.,
RGD-containing peptides and peptidomimetics can target cancer cells, in particular cells that
exhibit an αvβ3 integrin. Thus, one could use RGD peptides, cyclic peptides containing
RGD, RGD peptides that include D-amino acids, as well as synthetic RGD mimics. In
addition to RGD, one can use other moieties that target the αvβ3 integrin ligand. Generally,
such ligands can be used to control proliferating cells and angiogenesis. Preferred conjugates
of this type include an iRNA agent that targets PECAM-1, VEGF, or other cancer gene, e.g.,
a cancer gene described herein.

A "cell permeation peptide" is capable of permeating a cell, e.g., a microbial cell,
such as a bacterial or fungal cell, or a mammalian cell, such as a human cell. A microbial
cell-permeating peptide can be, for example, an α-helical linear peptide (e.g., LL-37 or
cecropin P1), a disulfide bond-containing peptide (e.g., α-defensin, β-defensin or bactenecin),
or a peptide containing only one or two dominating amino acids (e.g., PR-39 or indolicidin).
A cell permeation peptide can also include a nuclear localization signal (NLS). For example,
a cell permeation peptide can be a bipartite amphipathic peptide, such as MPG, which is
derived from the fusion peptide domain of HIV-1 gp41 and the NLS of SV40 large T antigen
(Simeoni et al., Nucl. Acids Res. 31:2717-2724, 2003).

In one embodiment, a targeting peptide tethered to an RRMS can be an amphipathic
α-helical peptide. Exemplary amphipathic α-helical peptides include, but are not limited to,
cecropins, lycotoxins, pardaxins, buforin, CPF, bombinin-like peptide (BLP), cathelicidins,
ceratotoxins, S. clava peptides, hagfish intestinal antimicrobial peptides (HFIAPs),
magainins, brevinins-2, dermaseptins, melittins, pleurocidin, H2A peptides, Xenopus
peptides, esculentins-1, and caerins. A number of factors will preferably be considered to
maintain the integrity of helix stability. For example, a maximum number of helix
stabilization residues will be utilized (e.g., leu, ala, or lys), and a minimum number of helix
destabilization residues will be utilized (e.g., proline, or cyclic monomeric units). The
capping residue will be considered (for example, Gly is an exemplary N-capping residue),
and/or C-terminal amidation can be used to provide an extra H-bond to stabilize the helix.
Formation of salt bridges between residues with opposite charges, separated by i ± 3 or i ± 4
positions, can provide stability. For example, cationic residues such as lysine, arginine,
homo-arginine, ornithine or histidine can form salt bridges with the anionic residues
glutamate or aspartate.
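
The i ± 3 / i ± 4 spacing rule for stabilizing salt bridges can be sketched in code. The Python example below scans a one-letter peptide sequence for oppositely charged residue pairs at those spacings. The function name `salt_bridge_pairs` is hypothetical, and the sketch covers only residues that have standard one-letter codes (Lys, Arg, His, Asp, Glu); homo-arginine and ornithine are omitted because they have no standard one-letter code. It checks sequence spacing only and makes no claim about actual helix geometry.

```python
CATIONIC = set("KRH")  # lysine, arginine, histidine
ANIONIC = set("DE")    # aspartate, glutamate


def salt_bridge_pairs(seq, offsets=(3, 4)):
    """Return 0-based (i, j) pairs of oppositely charged residues with j - i in offsets."""
    pairs = []
    for i, first in enumerate(seq):
        for offset in offsets:
            j = i + offset
            if j >= len(seq):
                continue
            second = seq[j]
            opposite = (
                (first in CATIONIC and second in ANIONIC)
                or (first in ANIONIC and second in CATIONIC)
            )
            if opposite:
                pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    # Hypothetical test sequence, not taken from the disclosure.
    print(salt_bridge_pairs("AEAAKAAEAAK"))
```

Scanning forward from each position is sufficient, since a pair separated by i - 3 or i - 4 is found when the earlier residue is visited.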

Peptide and peptidomimetic ligands include those having naturally occurring or
modified peptides, e.g., D or L peptides; α, β, or γ peptides; N-methyl peptides; azapeptides;
peptides having one or more amide, i.e., peptide, linkages replaced with one or more urea,
thiourea, carbamate, or sulfonyl urea linkages; or cyclic peptides.

Methods for making iRNA agents

iRNA agents can include modified or non-naturally occurring bases, e.g., bases
described in copending and coowned United States Provisional Application Serial No. 
60/463,772 (Attorney Docket No. 14174-070P01), filed on April 17, 2003, which is hereby 
incorporated by reference and/or in copending and coowned United States Provisional 
Application Serial No. 60/465,802 (Attorney Docket No. 14174-074P01), filed on April 25, 
2003, which is hereby incorporated by reference. Monomers and iRNA agents which include 
such bases can be made by the methods found in United States Provisional Application Serial 
No. 60/463,772 (Attorney Docket No. 14174-070P01), filed on April 17, 2003, and/or in 
United States Provisional Application Serial No. 60/465,802 (Attorney Docket No. 14174- 
074P01), filed on April 25, 2003. 

In addition, the invention includes iRNA agents having a modified or non-naturally
occurring base and another element described herein. E.g., the invention includes an iRNA
agent described herein, e.g., a palindromic iRNA agent, an iRNA agent having a
non-canonical pairing, an iRNA agent which targets a gene described herein, e.g., a gene active in
the liver, an iRNA agent having an architecture or structure described herein, an iRNA
associated with an amphipathic delivery agent described herein, an iRNA associated with a
drug delivery module described herein, an iRNA agent administered as described herein, or
an iRNA agent formulated as described herein, which also incorporates a modified or
non-naturally occurring base.

The synthesis and purification of oligonucleotide peptide conjugates can be 
performed by established methods. See, for example, Trufert et al, Tetrahedron, 52:3005, 
1996; and Manoharan, "Oligonucleotide Conjugates in Antisense Technology," in Antisense 
Drug Technology , ed. S.T. Crooke, Marcel Dekker, Inc., 2001. 

In one embodiment of the invention, a peptidomimetic can be modified to create a
constrained peptide that adopts a distinct and specific preferred conformation, which can
increase the potency and selectivity of the peptide. For example, the constrained peptide can
be an azapeptide (Gante, Synthesis, 405-413, 1989). An azapeptide is synthesized by
replacing the α-carbon of an amino acid with a nitrogen atom without changing the structure
of the amino acid side chain. For example, the azapeptide can be synthesized by using
hydrazine in traditional peptide synthesis coupling methods, such as by reacting hydrazine
with a "carbonyl donor," e.g., phenylchloroformate.

In one embodiment of the invention, a peptide or peptidomimetic (e.g., a peptide or
peptidomimetic tethered to an RRMS) can be an N-methyl peptide. N-methyl peptides are
composed of N-methyl amino acids, which provide an additional methyl group in the
peptide backbone, thereby potentially providing additional means of resistance to proteolytic
cleavage. N-methyl peptides can be synthesized by methods known in the art (see, for
example, Lindgren et al., Trends Pharmacol. Sci. 21:99, 2000; Cell Penetrating Peptides:
Processes and Applications, Langel, ed., CRC Press, Boca Raton, FL, 2002; Fischer et al.,
Bioconjugate Chem. 12:825, 2001; Wender et al., J. Am. Chem. Soc., 124:13382, 2002).
For example, an Ant or Tat peptide can be an N-methyl peptide.

In one embodiment of the invention, a peptide or peptidomimetic (e.g., a peptide or
peptidomimetic tethered to an RRMS) can be a β-peptide. β-peptides form stable secondary
structures such as helices, pleated sheets, turns and hairpins in solutions. Their cyclic
derivatives can fold into nanotubes in the solid state. β-peptides are resistant to degradation
by proteolytic enzymes. β-peptides can be synthesized by methods known in the art. For
example, an Ant or Tat peptide can be a β-peptide.

In one embodiment of the invention, a peptide or peptidomimetic (e.g., a peptide or
peptidomimetic tethered to an RRMS) can be an oligocarbamate. Oligocarbamate peptides are
internalized into a cell by a transport pathway facilitated by carbamate transporters. For
example, an Ant or Tat peptide can be an oligocarbamate.

In one embodiment of the invention, a peptide or peptidomimetic (e.g., a peptide or
peptidomimetic tethered to an RRMS) can be an oligourea conjugate (or an oligothiourea
conjugate), in which the amide bond of a peptidomimetic is replaced with a urea moiety.
Replacement of the amide bond provides increased resistance to degradation by proteolytic
enzymes, e.g., proteolytic enzymes in the gastrointestinal tract. In one embodiment, an
oligourea conjugate is tethered to an iRNA agent for use in oral delivery. The backbone in
each repeating unit of an oligourea peptidomimetic can be extended by one carbon atom in
comparison with the natural amino acid. The single carbon atom extension can increase
peptide stability and lipophilicity, for example. An oligourea peptide can therefore be
advantageous when an iRNA agent is directed for passage through a bacterial cell wall, or
when an iRNA agent must traverse the blood-brain barrier, such as for the treatment of a
neurological disorder. In one embodiment, a hydrogen bonding unit is conjugated to the
oligourea peptide, such as to create an increased affinity with a receptor. For example, an
Ant or Tat peptide can be an oligourea conjugate (or an oligothiourea conjugate).

The siRNA peptide conjugates of the invention can be affiliated with, e.g., tethered
to, RRMSs occurring at various positions on an iRNA agent. For example, a peptide can be
terminally conjugated, on either the sense or the antisense strand, or a peptide can be
bisconjugated (one peptide tethered to each end, one conjugated to the sense strand, and one
conjugated to the antisense strand). In another option, the peptide can be internally
conjugated, such as in the loop of a short hairpin iRNA agent. In yet another option, the
peptide can be affiliated with a complex, such as a peptide-carrier complex.

A peptide-carrier complex consists of at least a carrier molecule, which can
encapsulate one or more iRNA agents (such as for delivery to a biological system and/or a
cell), and a peptide moiety tethered to the outside of the carrier molecule, such as for
targeting the carrier complex to a particular tissue or cell type. A carrier complex can carry
additional targeting molecules on the exterior of the complex, or fusogenic agents to aid in
cell delivery. The one or more iRNA agents encapsulated within the carrier can be
conjugated to lipophilic molecules, which can aid in the delivery of the agents to the interior
of the carrier.

A carrier molecule or structure can be, for example, a micelle, a liposome (e.g., a
cationic liposome), a nanoparticle, a microsphere, or a biodegradable polymer. A peptide
moiety can be tethered to the carrier molecule by a variety of linkages, such as a disulfide
linkage, an acid labile linkage, a peptide-based linkage, an oxyamino linkage or a hydrazine
linkage. For example, a peptide-based linkage can be a GFLG peptide. Certain linkages will
have particular advantages, and the advantages (or disadvantages) can be considered
depending on the tissue target or intended use. For example, peptide based linkages are
stable in the blood stream but are susceptible to enzymatic cleavage in the lysosomes.



Targeting 

The iRNA agents of the invention are particularly useful when targeted to the liver.
An iRNA agent can be targeted to the liver by incorporation of an RRMS containing a ligand
that targets the liver. For example, a liver-targeting agent can be a lipophilic moiety.
Preferred lipophilic moieties include lipid, cholesterol, oleyl, retinyl, or cholesteryl residues.
Other lipophilic moieties that can function as liver-targeting agents include cholic acid,
adamantane acetic acid, 1-pyrene butyric acid, dihydrotestosterone,
1,3-Bis-O(hexadecyl)glycerol, geranyloxyhexyl group, hexadecylglycerol, borneol, menthol,
1,3-propanediol, heptadecyl group, palmitic acid, myristic acid, O3-(oleoyl)lithocholic acid,
O3-(oleoyl)cholenic acid, dimethoxytrityl, or phenoxazine.

An iRNA agent can also be targeted to the liver by association with a low-density
lipoprotein (LDL), such as lactosylated LDL. Polymeric carriers complexed with sugar
residues can also function to target iRNA agents to the liver.

A targeting agent that incorporates a sugar, e.g., galactose and/or analogues thereof, is
particularly useful. These agents target, in particular, the parenchymal cells of the liver. For
example, a targeting moiety can include more than one galactose moiety, preferably two or
three, spaced about 15 angstroms from each other. The targeting moiety can alternatively
be lactose (e.g., three lactose moieties), which is glucose coupled to a galactose. The
targeting moiety can also be N-acetyl-galactosamine or N-acetyl-glucosamine. A mannose or
mannose-6-phosphate targeting moiety can be used for macrophage targeting.

Conjugation of an iRNA agent with a serum albumin (SA), such as human serum 
albumin, can also be used to target the iRNA agent to the liver. 

An iRNA agent targeted to the liver by an RRMS targeting moiety described herein
can target a gene expressed in the liver. For example, the iRNA agent can target
p21(WAF1/DIP1), p27(KIP1), the α-fetoprotein gene, beta-catenin, or c-MET, such as for
treating a cancer of the liver. In another embodiment, the iRNA agent can target apoB-100,
such as for the treatment of an HDL/LDL cholesterol imbalance; dyslipidemias, e.g., familial
combined hyperlipidemia (FCHL), or acquired hyperlipidemia; hypercholesterolemia;
statin-resistant hypercholesterolemia; coronary artery disease (CAD); coronary heart disease
(CHD); or atherosclerosis. In another embodiment, the iRNA agent can target forkhead
homologue in rhabdomyosarcoma (FKHR); glucagon; glucagon receptor; glycogen
phosphorylase; PPAR-gamma coactivator (PGC-1); fructose-1,6-bisphosphatase;
glucose-6-phosphatase; glucose-6-phosphate translocator; glucokinase inhibitory regulatory protein;
or phosphoenolpyruvate carboxykinase (PEPCK), such as to inhibit hepatic glucose
production in a mammal, such as a human, such as for the treatment of diabetes. In another
embodiment, an iRNA agent targeted to the liver can target Factor V, e.g., the Leiden Factor
V allele, such as to reduce the tendency to form a blood clot. An iRNA agent targeted to the
liver can include a sequence which targets hepatitis virus (e.g., Hepatitis A, B, C, D, E, F, G,
or H). For example, an iRNA agent of the invention can target any one of the nonstructural
proteins of HCV: NS3, 4A, 4B, 5A, or 5B. For the treatment of hepatitis B, an iRNA agent
can target the protein X (HBx) gene, for example.

Preferred ligands on RRMSs include folic acid, glucose, cholesterol, cholic acid, 
Vitamin E, Vitamin K, or Vitamin A. 

Definitions

The term "halo" refers to any radical of fluorine, chlorine, bromine or iodine.
The term "alkyl" refers to a hydrocarbon chain that may be a straight chain or
branched chain, containing the indicated number of carbon atoms. For example, C1-C12 alkyl
indicates that the group may have from 1 to 12 (inclusive) carbon atoms in it. The term
"haloalkyl" refers to an alkyl in which one or more hydrogen atoms are replaced by halo, and
includes alkyl moieties in which all hydrogens have been replaced by halo (e.g.,
perfluoroalkyl). Alkyl and haloalkyl groups may be optionally inserted with O, N, or S. The
term "aralkyl" refers to an alkyl moiety in which an alkyl hydrogen atom is replaced by an
aryl group. Aralkyl includes groups in which more than one hydrogen atom has been
replaced by an aryl group. Examples of "aralkyl" include benzyl, 9-fluorenyl, benzhydryl,
and trityl groups.

The term "alkenyl" refers to a straight or branched hydrocarbon chain containing 2-8
carbon atoms and characterized in having one or more double bonds. Examples of a typical
alkenyl include, but are not limited to, allyl, propenyl, 2-butenyl, 3-hexenyl and 3-octenyl
groups. The term "alkynyl" refers to a straight or branched hydrocarbon chain containing 2-8
carbon atoms and characterized in having one or more triple bonds. Some examples of a
typical alkynyl are ethynyl, 2-propynyl, 3-methylbutynyl, and propargyl. The sp2 and
sp3 carbons may optionally serve as the point of attachment of the alkenyl and alkynyl
groups, respectively.

The term "alkoxy" refers to an -O-alkyl radical. The term "aminoalkyl" refers to an
alkyl substituted with an amino group. The term "mercapto" refers to an -SH radical. The term
"thioalkoxy" refers to an -S-alkyl radical.

The term "alkylene" refers to a divalent alkyl (i.e., -R-), e.g., -CH2-, -CH2CH2-, and
-CH2CH2CH2-. The term "alkylenedioxo" refers to a divalent species of the structure
-O-R-O-, in which R represents an alkylene.

The term "aryl" refers to an aromatic monocyclic, bicyclic, or tricyclic hydrocarbon
ring system, wherein any ring atom capable of substitution can be substituted by a
substituent. Examples of aryl moieties include, but are not limited to, phenyl, naphthyl, and
anthracenyl.

The term "cycloalkyl" as employed herein includes saturated cyclic, bicyclic,
tricyclic, or polycyclic hydrocarbon groups having 3 to 12 carbons, wherein any ring atom
capable of substitution can be substituted by a substituent. The cycloalkyl groups herein
described may also contain fused rings. Fused rings are rings that share a common
carbon-carbon bond. Examples of cycloalkyl moieties include, but are not limited to, cyclohexyl,
adamantyl, and norbornyl.

The term "heterocyclyl" refers to a nonaromatic 3-10 membered monocyclic, 8-12
membered bicyclic, or 11-14 membered tricyclic ring system having 1-3 heteroatoms if
monocyclic, 1-6 heteroatoms if bicyclic, or 1-9 heteroatoms if tricyclic, said heteroatoms
selected from O, N, or S (e.g., carbon atoms and 1-3, 1-6, or 1-9 heteroatoms of N, O, or S if
monocyclic, bicyclic, or tricyclic, respectively), wherein any ring atom capable of
substitution can be substituted by a substituent. The heterocyclyl groups herein described
may also contain fused rings. Fused rings are rings that share a common carbon-carbon
bond. Examples of heterocyclyl include, but are not limited to, tetrahydrofuranyl,
tetrahydropyranyl, piperidinyl, morpholino, pyrrolinyl and pyrrolidinyl.

The term "heteroaryl" refers to an aromatic 5-8 membered monocyclic, 8-12
membered bicyclic, or 11-14 membered tricyclic ring system having 1-3 heteroatoms if
monocyclic, 1-6 heteroatoms if bicyclic, or 1-9 heteroatoms if tricyclic, said heteroatoms
selected from O, N, or S (e.g., carbon atoms and 1-3, 1-6, or 1-9 heteroatoms of N, O, or S if
monocyclic, bicyclic, or tricyclic, respectively), wherein any ring atom capable of
substitution can be substituted by a substituent.

The term "oxo" refers to an oxygen atom, which forms a carbonyl when attached to
carbon, an N-oxide when attached to nitrogen, and a sulfoxide or sulfone when attached to
sulfur.

The term "acyl" refers to an alkylcarbonyl, cycloalkylcarbonyl, arylcarbonyl, 
heterocyclylcarbonyl, or heteroarylcarbonyl substituent, any of which may be further
substituted by substituents. 

The term "substituents" refers to a group "substituted" on an alkyl, cycloalkyl,
alkenyl, alkynyl, heterocyclyl, heterocycloalkenyl, cycloalkenyl, aryl, or heteroaryl group at
any atom of that group. Suitable substituents include, without limitation, alkyl, alkenyl,
alkynyl, alkoxy, halo, hydroxy, cyano, nitro, amino, SO3H, sulfate, phosphate,
perfluoroalkyl, perfluoroalkoxy, methylenedioxy, ethylenedioxy, carboxyl, oxo, thioxo,
imino (alkyl, aryl, aralkyl), S(O)n alkyl (where n is 0-2), S(O)n aryl (where n is 0-2), S(O)n
heteroaryl (where n is 0-2), S(O)n heterocyclyl (where n is 0-2), amine (mono-, di-, alkyl,
cycloalkyl, aralkyl, heteroaralkyl, and combinations thereof), ester (alkyl, aralkyl,
heteroaralkyl), amide (mono-, di-, alkyl, aralkyl, heteroaralkyl, and combinations thereof),
sulfonamide (mono-, di-, alkyl, aralkyl, heteroaralkyl, and combinations thereof),
unsubstituted aryl, unsubstituted heteroaryl, unsubstituted heterocyclyl, and unsubstituted
cycloalkyl. In one aspect, the substituents on a group are independently any one single, or
any subset of the aforementioned substituents.

The terms "adeninyl, cytosinyl, guaninyl, thyminyl, and uracilyl" and the like refer to
radicals of adenine, cytosine, guanine, thymine, and uracil.

As used herein, an "unusual" nucleobase can include any one of the following: 
2-methyladeninyl,
N6-methyladeninyl,
2-methylthio-N6-methyladeninyl,
N6-isopentenyladeninyl,
2-methylthio-N6-isopentenyladeninyl,
N6-(cis-hydroxyisopentenyl)adeninyl,
2-methylthio-N6-(cis-hydroxyisopentenyl)adeninyl,
N6-glycinylcarbamoyladeninyl,
N6-threonylcarbamoyladeninyl,
2-methylthio-N6-threonylcarbamoyladeninyl,
N6-methyl-N6-threonylcarbamoyladeninyl,
N6-hydroxynorvalylcarbamoyladeninyl,
2-methylthio-N6-hydroxynorvalylcarbamoyladeninyl,
N6,N6-dimethyladeninyl,
3-methylcytosinyl,
5-methylcytosinyl,
2-thiocytosinyl,
5-formylcytosinyl,
N4-methylcytosinyl,
5-hydroxymethylcytosinyl,
1-methylguaninyl,
N2-methylguaninyl,
7-methylguaninyl,
N2,N2-dimethylguaninyl,
N2,7-dimethylguaninyl,
N2,N2,7-trimethylguaninyl,
1-methylguaninyl,
7-cyano-7-deazaguaninyl,
7-aminomethyl-7-deazaguaninyl,
pseudouracilyl,
dihydrouracilyl,
5-methyluracilyl,
1-methylpseudouracilyl,
2-thiouracilyl,
4-thiouracilyl,
2-thiothyminyl,
5-methyl-2-thiouracilyl,
3-(3-amino-3-carboxypropyl)uracilyl,
5-hydroxyuracilyl,
5-methoxyuracilyl,
uracilyl 5-oxyacetic acid,
uracilyl 5-oxyacetic acid methyl ester,
5-(carboxyhydroxymethyl)uracilyl,
5-(carboxyhydroxymethyl)uracilyl methyl ester,
5-methoxycarbonylmethyluracilyl,
5-methoxycarbonylmethyl-2-thiouracilyl,
5-aminomethyl-2-thiouracilyl,
5-methylaminomethyluracilyl,
5-methylaminomethyl-2-thiouracilyl,
5-methylaminomethyl-2-selenouracilyl,
5-carbamoylmethyluracilyl,
5-carboxymethylaminomethyluracilyl,
5-carboxymethylaminomethyl-2-thiouracilyl,
3-methyluracilyl,
1-methyl-3-(3-amino-3-carboxypropyl)pseudouracilyl,
5-carboxymethyluracilyl,
5-methyldihydrouracilyl, or
3-methylpseudouracilyl.



Asymmetrical Modifications

In one aspect, the invention features an iRNA agent which can be asymmetrically 
modified as described herein. 

In addition, the invention includes iRNA agents having asymmetrical modifications
and another element described herein. E.g., the invention includes an iRNA agent described
herein, e.g., a palindromic iRNA agent, an iRNA agent having a non canonical pairing, an
iRNA agent which targets a gene described herein, e.g., a gene active in the liver, an iRNA
agent having an architecture or structure described herein, an iRNA associated with an
amphipathic delivery agent described herein, an iRNA associated with a drug delivery
module described herein, an iRNA agent administered as described herein, or an iRNA agent
formulated as described herein, which also incorporates an asymmetrical modification.

iRNA agents of the invention can be asymmetrically modified. An asymmetrically
modified iRNA agent is one in which a strand has a modification which is not present on the
other strand. An asymmetrical modification is a modification found on one strand but not on
the other strand. Any modification, e.g., any modification described herein, can be present as
an asymmetrical modification. An asymmetrical modification can confer any of the desired
properties associated with a modification, e.g., those properties discussed herein. E.g., an
asymmetrical modification can: confer resistance to degradation or an alteration in half-life;
target the iRNA agent to a particular target, e.g., to a particular tissue; modulate, e.g.,
increase or decrease, the affinity of a strand for its complement or target sequence; or hinder
or promote modification of a terminal moiety, e.g., modification by a kinase or other
enzymes involved in the RISC mechanism pathway. The designation of a modification as
having one property does not mean that it has no other property, e.g., a modification referred
to as one which promotes stabilization might also enhance targeting.
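
The definition above (a modification found on one strand but not on the other) can be illustrated with a minimal Python sketch. It is an illustration only, not part of the disclosed methods: the function names and the representation of each strand as a set of modification labels are assumptions, and the sketch ignores the position of a modification within a strand.

```python
def asymmetric_modifications(sense_mods, antisense_mods):
    """Return the modification labels present on exactly one strand.

    Both arguments are iterables of labels (hypothetical names such as
    "2'-OMe" or "phosphorothioate"); positions within a strand are ignored.
    """
    sense, antisense = set(sense_mods), set(antisense_mods)
    return {
        "sense_only": sense - antisense,
        "antisense_only": antisense - sense,
    }


def is_asymmetrically_modified(sense_mods, antisense_mods):
    """True if at least one modification is found on one strand only."""
    result = asymmetric_modifications(sense_mods, antisense_mods)
    return bool(result["sense_only"] or result["antisense_only"])


# Example: cholesterol only on the sense strand, phosphorothioate only on
# the antisense strand, 2'-OMe shared by both strands.
example = asymmetric_modifications(
    {"2'-OMe", "cholesterol"}, {"2'-OMe", "phosphorothioate"}
)
```

In this example the shared 2'-OMe label is not reported, because under this simplified model it is present on both strands.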

While not wishing to be bound by theory or any particular mechanistic model, it is
believed that asymmetrical modification allows an iRNA agent to be optimized in view of the
different or "asymmetrical" functions of the sense and antisense strands. For example, both
strands can be modified to increase nuclease resistance; however, since some changes can
inhibit RISC activity, these changes can be chosen for the sense strand. In addition, since
some modifications, e.g., targeting moieties, can add large bulky groups that, e.g., can
interfere with the cleavage activity of the RISC complex, such modifications are preferably
placed on the sense strand. Thus, targeting moieties, especially bulky ones (e.g., cholesterol),
are preferentially added to the sense strand. In one embodiment, an asymmetrical
modification in which a phosphate of the backbone is substituted with S, e.g., a
phosphorothioate modification, is present in the antisense strand, and a 2' modification, e.g.,
2' OMe, is present in the sense strand. A targeting moiety can be present at either (or both)
the 5' or 3' end of the sense strand of the iRNA agent. In a preferred example, a P of the
backbone is replaced with S in the antisense strand, 2' OMe is present in the sense strand, and
a targeting moiety is added to either the 5' or 3' end of the sense strand of the iRNA agent.

In a preferred embodiment an asymmetrically modified iRNA agent has a
modification on the sense strand which modification is not found on the antisense strand and
the antisense strand has a modification which is not found on the sense strand.

Each strand can include one or more asymmetrical modifications. By way of
example: one strand can include a first asymmetrical modification which confers a first
property on the iRNA agent and the other strand can have a second asymmetrical
modification which confers a second property on the iRNA. E.g., one strand, e.g., the sense
strand, can have a modification which targets the iRNA agent to a tissue, and the other strand,
e.g., the antisense strand, has a modification which promotes hybridization with the target
gene sequence.

In some embodiments both strands can be modified to optimize the same property, 
e.g., to increase resistance to nucleolytic degradation, but different modifications are chosen 
for the sense and the antisense strands, e.g., because the modifications affect other properties
as well. E.g., since some changes can affect RISC activity these modifications are chosen for 
the sense strand. 

In an embodiment one strand has an asymmetrical 2' modification, e.g., a 2' OMe
modification, and the other strand has an asymmetrical modification of the phosphate
backbone, e.g., a phosphorothioate modification. So, in one embodiment the antisense strand
has an asymmetrical 2' OMe modification and the sense strand has an asymmetrical
phosphorothioate modification (or vice versa). In a particularly preferred embodiment the
RNAi agent will have asymmetrical 2'-O alkyl, preferably 2'-OMe, modifications on the
sense strand and an asymmetrical backbone P modification, preferably a phosphorothioate
modification, in the antisense strand. There can be one or multiple 2'-OMe modifications,
e.g., at least 2, 3, 4, 5, or 6 of the subunits of the sense strand can be so modified. There can
be one or multiple phosphorothioate modifications, e.g., at least 2, 3, 4, 5, or 6 of the
subunits of the antisense strand can be so modified. It is preferable to have an iRNA agent
wherein there are multiple 2'-OMe modifications on the sense strand and multiple
phosphorothioate modifications on the antisense strand. All of the subunits on one or both
strands can be so modified. A particularly preferred embodiment of multiple asymmetric
modification on both strands has a duplex region about 20-21, and preferably 19, subunits in
length and one or two 3' overhangs of about 2 subunits in length.

Asymmetrical modifications are useful for promoting resistance to degradation by
nucleases, e.g., endonucleases. iRNA agents can include one or more asymmetrical
modifications which promote resistance to degradation. In preferred embodiments the
modification on the antisense strand is one which will not interfere with silencing of the
target, e.g., one which will not interfere with cleavage of the target. Most if not all sites on a
strand are vulnerable, to some degree, to degradation by endonucleases. One can determine
sites which are relatively vulnerable and insert asymmetrical modifications which inhibit
degradation. It is often desirable to provide asymmetrical modification of a UA site in an
iRNA agent, and in some cases it is desirable to provide the UA sequence on both strands
with asymmetrical modification. Examples of modifications which inhibit endonucleolytic
degradation can be found herein. Particularly favored modifications include: 2'
modification, e.g., provision of a 2' OMe moiety on the U, especially on a sense strand;
modification of the backbone, e.g., with the replacement of an O with an S, in the phosphate
backbone, e.g., the provision of a phosphorothioate modification, on the U or the A or both,
especially on an antisense strand; replacement of the U with a C5 amino linker; replacement
of the A with a G (sequence changes are preferred to be located on the sense strand and not
the antisense strand); and modification of the at the 2', 6', 7', or 8' position. Preferred
embodiments are those in which one or more of these modifications are present on the sense
but not the antisense strand, or embodiments where the antisense strand has fewer of such
modifications.
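
As a hedged illustration of locating the UA sites mentioned above, the following Python sketch scans a strand sequence for UA dinucleotides. The function name and the example sequence are hypothetical, and the sketch only reports positions; it does not decide which modification, if any, should be placed at a site.

```python
def find_ua_sites(strand):
    """Return 0-based indices (5' to 3') of each U that is followed by an A.

    `strand` is an RNA sequence written 5' to 3' using the letters A, C, G, U.
    """
    seq = strand.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "UA"]


# Hypothetical sequence used only to demonstrate the scan.
sites = find_ua_sites("GUACGUUAUAC")  # -> [1, 6, 8]
```

Each returned index is a candidate position for, e.g., a 2' OMe moiety on the U or a phosphorothioate modification adjacent to it, subject to the strand preferences stated in the text.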

Asymmetrical modification can be used to inhibit degradation by exonucleases.
Asymmetrical modifications can include those in which only one strand is modified as well
as those in which both are modified. In preferred embodiments the modification on the
antisense strand is one which will not interfere with silencing of the target, e.g., one which
will not interfere with cleavage of the target. Some embodiments will have an asymmetrical
modification on the sense strand, e.g., in a 3' overhang, e.g., at the 3' terminus, and on the
antisense strand, e.g., in a 3' overhang, e.g., at the 3' terminus. If the modifications introduce
moieties of different size it is preferable that the larger be on the sense strand. If the
modifications introduce moieties of different charge it is preferable that the one with greater
charge be on the sense strand.

Examples of modifications which inhibit exonucleolytic degradation can be found
herein. Particularly favored modifications include: 2' modification, e.g., provision of a 2'
OMe moiety in a 3' overhang, e.g., at the 3' terminus (3' terminus means at the 3' atom of
the molecule or at the most 3' moiety, e.g., the most 3' P or 2' position, as indicated by the
context); modification of the backbone, e.g., with the replacement of a P with an S, e.g., the
provision of a phosphorothioate modification, or the use of a methylated P in a 3' overhang,
e.g., at the 3' terminus; combination of a 2' modification, e.g., provision of a 2' OMe
moiety, and modification of the backbone, e.g., with the replacement of a P with an S, e.g.,
the provision of a phosphorothioate modification, or the use of a methylated P, in a 3'
overhang, e.g., at the 3' terminus; modification with a 3' alkyl; modification with an abasic
pyrrolidine in a 3' overhang, e.g., at the 3' terminus; modification with naproxen, ibuprofen,
or other moieties which inhibit degradation at the 3' terminus. Preferred embodiments are
those in which one or more of these modifications are present on the sense but not the
antisense strand, or embodiments where the antisense strand has fewer of such modifications.

Modifications, e.g., those described herein, which affect targeting can be provided as
asymmetrical modifications. Targeting modifications which can inhibit silencing, e.g., by
inhibiting cleavage of a target, can be provided as asymmetrical modifications of the sense
strand. A biodistribution altering moiety, e.g., cholesterol, can be provided in one or more,
e.g., two, asymmetrical modifications of the sense strand. Targeting modifications which
introduce moieties having a relatively large molecular weight, e.g., a molecular weight of
more than 400, 500, or 1000 daltons, or which introduce a charged moiety (e.g., having more
than one positive charge or one negative charge) can be placed on the sense strand.

Modifications, e.g., those described herein, which modulate, e.g., increase or
decrease, the affinity of a strand for its complement or target, can be provided as
asymmetrical modifications. These include: 5 methyl U; 5 methyl C; pseudouridine; locked
nucleic acids; 2 thio U; and 2-amino-A. In some embodiments one or more of these is
provided on the antisense strand.

iRNA agents have a defined structure, with a sense strand and an antisense strand,
and in many cases short single strand overhangs, e.g., of 2 or 3 nucleotides, are present at one
or both 3' ends. Asymmetrical modification can be used to optimize the activity of such a
structure, e.g., by being placed selectively within the iRNA. E.g., the end region of the iRNA
agent defined by the 5' end of the sense strand and the 3' end of the antisense strand is
important for function. This region can include the terminal 2, 3, or 4 paired nucleotides and
any 3' overhang. In preferred embodiments asymmetrical modifications which result in one
or more of the following are used: modifications of the 5' end of the sense strand which
inhibit kinase activation of the sense strand, including, e.g., attachments of conjugates which
target the molecule or the use of modifications which protect against 5' exonucleolytic
degradation; or modifications of either strand, but preferably the sense strand, which enhance
binding between the sense and antisense strand and thereby promote a "tight" structure at this
end of the molecule.

The end region of the iRNA agent defined by the 3' end of the sense strand and the
5' end of the antisense strand is also important for function. This region can include the
terminal 2, 3, or 4 paired nucleotides and any 3' overhang. Preferred embodiments include
asymmetrical modifications of either strand, but preferably the sense strand, which decrease
binding between the sense and antisense strand and thereby promote an "open" structure at
this end of the molecule. Such modifications include placing conjugates which target the
molecule or modifications which promote nuclease resistance on the sense strand in this
region. Modifications of the antisense strand which inhibit kinase activation are avoided in
preferred embodiments.
Exemplary modifications for asymmetrical placement in the sense strand include the
following:

(a) backbone modifications, e.g., modification of a backbone P, including
replacement of P with S, or P substituted with alkyl or allyl, e.g., Me, and dithioates (S-P=S);
these modifications can be used to promote nuclease resistance;

(b) 2'-O alkyl, e.g., 2'-OMe, 3'-O alkyl, e.g., 3'-OMe (at terminal and/or internal
positions); these modifications can be used to promote nuclease resistance or to enhance
binding of the sense to the antisense strand; the 3' modifications can be used at the 5' end of
the sense strand to avoid sense strand activation by RISC;

(c) 2'-5' linkages (with 2'-H, 2'-OH and 2'-OMe and with P=O or P=S); these
modifications can be used to promote nuclease resistance or to inhibit binding of the sense to
the antisense strand, or can be used at the 5' end of the sense strand to avoid sense strand
activation by RISC;

(d) L sugars (e.g., L ribose, L-arabinose with 2'-H, 2'-OH and 2'-OMe); these
modifications can be used to promote nuclease resistance or to inhibit binding of the sense to
the antisense strand, or can be used at the 5' end of the sense strand to avoid sense strand
activation by RISC;

(e) modified sugars (e.g., locked nucleic acids (LNA's), hexose nucleic acids
(HNA's) and cyclohexene nucleic acids (CeNA's)); these modifications can be used to
promote nuclease resistance or to inhibit binding of the sense to the antisense strand, or can
be used at the 5' end of the sense strand to avoid sense strand activation by RISC;

(f) nucleobase modifications (e.g., C-5 modified pyrimidines, N-2 modified purines,
N-7 modified purines, N-6 modified purines); these modifications can be used to promote
nuclease resistance or to enhance binding of the sense to the antisense strand;

(g) cationic groups and Zwitterionic groups (preferably at a terminus); these
modifications can be used to promote nuclease resistance;

(h) conjugate groups (preferably at terminal positions), e.g., naproxen, biotin,
cholesterol, ibuprofen, folic acid, peptides, and carbohydrates; these modifications can be
used to promote nuclease resistance or to target the molecule, or can be used at the 5' end of
the sense strand to avoid sense strand activation by RISC.
Exemplary modifications for asymmetrical placement in the antisense strand include
the following:

(a) backbone modifications, e.g., modification of a backbone P, including
replacement of P with S, or P substituted with alkyl or allyl, e.g., Me, and dithioates (S-P=S);

(b) 2'-O alkyl, e.g., 2'-OMe (at terminal positions);

(c) 2'-5' linkages (with 2'-H, 2'-OH and 2'-OMe), e.g., terminal at the 3' end, e.g.,
with P=O or P=S, preferably at the 3' end; these modifications are preferably excluded from
the 5' end region as they may interfere with RISC enzyme activity such as kinase activity;

(d) L sugars (e.g., L ribose, L-arabinose with 2'-H, 2'-OH and 2'-OMe), e.g., terminal
at the 3' end, e.g., with P=O or P=S, preferably at the 3' end; these modifications are
preferably excluded from the 5' end region as they may interfere with kinase activity;

(e) modified sugars (e.g., LNA's, HNA's and CeNA's); these modifications are
preferably excluded from the 5' end region as they may contribute to unwanted
enhancements of pairing between the sense and antisense strands (it is often preferred to have
a "loose" structure in the 5' region); additionally, they may interfere with kinase activity;

(f) nucleobase modifications (e.g., C-5 modified pyrimidines, N-2 modified purines,
N-7 modified purines, N-6 modified purines);

(g) cationic groups and Zwitterionic groups (preferably at a terminus);

(h) conjugate groups (preferably at terminal positions), e.g., naproxen, biotin, cholesterol,
ibuprofen, folic acid, peptides, and carbohydrates, but bulky groups, or generally groups
which inhibit RISC activity, are less preferred.

The 5'-OH of the antisense strand should be kept free to promote activity. In some 
preferred embodiments modifications that promote nuclease resistance should be included at 
the 3' end, particularly in the 3' overhang. 

In another aspect, the invention features a method of optimizing, e.g., stabilizing, an
iRNA agent. The method includes selecting a sequence having activity and introducing one or
more asymmetric modifications into the sequence, wherein the introduction of the
asymmetric modification optimizes a property of the iRNA agent but does not result in a
decrease in activity.

The decrease in activity can be less than a preselected level of decrease. In
preferred embodiments decrease in activity means a decrease of less than 5, 10, 20, 40, or
50% activity, as compared with an otherwise similar iRNA lacking the introduced
modification. Activity can, e.g., be measured in vivo, or in vitro, with a result in either being
sufficient to demonstrate the required maintenance of activity.
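
The comparison against a preselected level of decrease can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: activity is taken to be a single positive number from any assay (the text does not prescribe units), and the function names and the default threshold of 20% are hypothetical choices drawn from the listed values.

```python
def percent_decrease(unmodified_activity, modified_activity):
    """Percent loss of activity relative to the unmodified iRNA agent."""
    if unmodified_activity <= 0:
        raise ValueError("unmodified activity must be positive")
    return 100.0 * (unmodified_activity - modified_activity) / unmodified_activity


def retains_activity(unmodified_activity, modified_activity, max_decrease_pct=20.0):
    """True if the decrease is less than the preselected level.

    max_decrease_pct is the preselected level, e.g., 5, 10, 20, 40, or 50.
    """
    return percent_decrease(unmodified_activity, modified_activity) < max_decrease_pct


# Hypothetical readings: 80 units unmodified, 72 units after modification,
# which is a 10% decrease and passes a 20% threshold but not a 5% threshold.
ok_at_20 = retains_activity(80, 72, max_decrease_pct=20.0)
ok_at_5 = retains_activity(80, 72, max_decrease_pct=5.0)
```

The strict inequality mirrors the wording "a decrease of less than" the preselected percentage.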

The optimized property can be any property described herein and in particular the
properties discussed in the section on asymmetrical modifications provided herein. The
modification can be any asymmetrical modification, e.g., an asymmetric modification
described in the section on asymmetrical modifications described herein. Particularly
preferred asymmetric modifications are 2'-O alkyl modifications, e.g., 2'-OMe
modifications, particularly in the sense sequence, and modifications of a backbone O,
particularly phosphorothioate modifications, in the antisense sequence.

In a preferred embodiment a sense sequence is selected and provided with an
asymmetrical modification, while in other embodiments an antisense sequence is selected
and provided with an asymmetrical modification. In some embodiments both sense and
antisense sequences are selected and each provided with one or more asymmetrical
modifications.

Multiple asymmetric modifications can be introduced into either or both of the sense 
and antisense sequence. A sequence can have at least 2, 4, 6, 8, or more modifications and 
all or substantially all of the monomers of a sequence can be modified. 
Table 2. Some examples of Asymmetric Modification

This table shows examples having strand I with a selected modification and strand II
with a selected modification.

Strand I                                        Strand II

Nuclease Resistance (e.g., 2'-OMe)              Biodistribution (e.g., P=S)

Biodistribution conjugate                       Protein Binding Functionality
(e.g., Lipophile)                               (e.g., Naproxen)

Tissue Distribution Functionality               Cell Targeting Functionality
(e.g., Carbohydrates)                           (e.g., Folate for cancer cells)

Tissue Distribution Functionality               Fusogenic Functionality
(e.g., Liver Cell Targeting Carbohydrates)      (e.g., Polyethylene imines)

Cancer Cell Targeting                           Fusogenic Functionality
(e.g., RGD peptides and imines)                 (e.g., peptides)

Nuclease Resistance (e.g., 2'-OMe)              Increase in binding Affinity (5-Me-C, 5-Me-U,
                                                2-thio-U, 2-amino-A, G-clamp, LNA)

Tissue Distribution Functionality               RISC activity improving Functionality

Helical conformation changing                   Tissue Distribution Functionality
Functionalities                                 (P=S; lipophile, carbohydrates)
Z-X-Y Architecture

In one aspect, the invention features an iRNA agent which can have a Z-X-Y
architecture or structure such as those described herein and those described in copending,
co-owned United States Provisional Application Serial No. 60/510,246 (Attorney Docket No.
14174-079P02), filed on October 9, 2003, which is hereby incorporated by reference, and in
copending, co-owned United States Provisional Application Serial No. 60/510,318 (Attorney
Docket No. 14174-079P03), filed on October 10, 2003, which is hereby incorporated by
reference.

In addition, the invention includes iRNA agents having a Z-X-Y structure and another
element described herein. E.g., the invention includes an iRNA agent described herein, e.g.,
a palindromic iRNA agent, an iRNA agent having a non canonical pairing, an iRNA agent
which targets a gene described herein, e.g., a gene active in the liver, an iRNA associated
with an amphipathic delivery agent described herein, an iRNA associated with a drug
delivery module described herein, an iRNA agent administered as described herein, or an
iRNA agent formulated as described herein, which also incorporates a Z-X-Y architecture.

The invention provides an iRNA agent having a first segment, the Z region, a second
segment, the X region, and optionally a third region, the Y region:

Z-X-Y.

It may be desirable to modify subunits in one or both of Z and/or Y on one hand and X
on the other hand. In some cases they will have the same modification or the same class of
modification but it will more often be the case that the modifications made in Z and/or Y will
differ from those made in X.

The Z region typically includes a terminus of an iRNA agent. The length of the Z
region can vary, but will typically be from 2-14, more preferably 2-10, subunits in length. It
typically is single stranded, i.e., it will not base pair with bases of another strand, though it
may in some embodiments self associate, e.g., to form a loop structure. Such structures can
be formed by the end of a strand looping back and forming an intrastrand duplex. E.g., 2, 3,
4, 5 or more intra-strand base pairs can form, having a looped out or connecting region,
typically of 2 or more subunits which do not pair. This can occur at one or both ends of a
strand. A typical embodiment of a Z region is a single strand overhang, e.g., an overhang of
the length described elsewhere herein. The Z region can thus be or include a 3' or 5'
terminal single strand. It can be sense or antisense strand but if it is antisense it is preferred
that it is a 3' overhang. Typical inter-subunit bonds in the Z region include: P=O; P=S;
S-P=S; P-NR2; and P-BR2. Chiral P=X (where X is S, N, or B) inter-subunit bonds can also be
present. (These inter-subunit bonds are discussed in more detail elsewhere herein.) Other
preferred Z region subunit modifications (also discussed elsewhere herein) can include:
3'-OR, 3'SR, 2'-OMe, 3'-OMe, and 2'OH modifications and moieties; alpha configuration
bases; and 2' arabino modifications.

The X region will in most cases be duplexed, in the case of a single strand iRNA
agent, with a corresponding region of the single strand, or in the case of a double stranded
iRNA agent, with the corresponding region of the other strand. The length of the X region
can vary but will typically be between 10-45 and more preferably between 15 and 35
subunits. Particularly preferred region X's will include 17, 18, 19, 29, 21, 22, 23, 24, or 25
nucleotide pairs, though other suitable lengths are described elsewhere herein and can be
used. Typical X region subunits include 2'-OH subunits. In typical embodiments phosphate
inter-subunit bonds are preferred while phosphorothioate or non-phosphate bonds are absent.
Other modifications preferred in the X region include: modifications to improve binding,
e.g., nucleobase modifications; cationic nucleobase modifications; and C-5 modified
pyrimidines, e.g., allylamines. Some embodiments have 4 or more consecutive 2'OH
subunits. While the use of phosphorothioate linkages is sometimes not preferred, they can be
used if they connect fewer than 4 consecutive 2'OH subunits.

The Y region will generally conform to the parameters set out for the Z regions.
However, the Y and Z regions need not be the same; different types and numbers of
modifications can be present, and in fact, one will usually be a 3' overhang and one will
usually be a 5' overhang.

In a preferred embodiment the iRNA agent will have a Y and/or Z region each having
ribonucleosides in which the 2'-OH is substituted, e.g., with 2'-OMe or other alkyl; and an X
region that includes at least four consecutive ribonucleoside subunits in which the 2'-OH
remains unsubstituted.
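
One way to picture the preferred Z-X-Y pattern just described is the following Python sketch. It is a simplified illustration, not the disclosed method: a strand is modeled as a list of 2' substituent labels (the labels "OH" and "OMe", the function names, and the region lengths in the example are all assumptions), and the check requires every Z and Y subunit to be 2'-substituted while X contains at least four consecutive unsubstituted 2'-OH subunits.

```python
def split_zxy(subunits, z_len, y_len=0):
    """Split a strand into its Z, X, and (optional) Y regions by length."""
    if z_len < 0 or y_len < 0 or z_len + y_len > len(subunits):
        raise ValueError("region lengths do not fit the strand")
    x_end = len(subunits) - y_len
    return subunits[:z_len], subunits[z_len:x_end], subunits[x_end:]


def longest_run(items, value):
    """Length of the longest stretch of consecutive items equal to value."""
    best = current = 0
    for item in items:
        current = current + 1 if item == value else 0
        best = max(best, current)
    return best


def matches_preferred_pattern(subunits, z_len, y_len=0, min_oh_run=4):
    """Check substituted Z/Y flanks and a run of unsubstituted 2'-OH in X."""
    z, x, y = split_zxy(subunits, z_len, y_len)
    flanks_substituted = all(label != "OH" for label in z + y)
    return flanks_substituted and longest_run(x, "OH") >= min_oh_run


# Hypothetical 23-subunit strand: two 2'-OMe subunits at each end, with
# 19 unsubstituted 2'-OH subunits in between.
strand = ["OMe", "OMe"] + ["OH"] * 19 + ["OMe", "OMe"]
```

Calling `matches_preferred_pattern(strand, z_len=2, y_len=2)` on this hypothetical strand returns True; replacing a flanking label with "OH" makes it return False.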

The subunit linkages (the linkages between subunits) of an iRNA agent can be
modified, e.g., to promote resistance to degradation. Numerous examples of such
modifications are disclosed herein, one example of which is the phosphorothioate linkage.
These modifications can be provided between the subunits of any of the regions, Y, X, and Z.
However, it is preferred that their occurrence is minimized and in particular it is preferred that
consecutive modified linkages be avoided.

In a preferred embodiment the iRNA agent will have a Y and Z region each having
ribonucleosides in which the 2'-OH is substituted, e.g., with 2'-OMe; and an X region that
includes at least four consecutive subunits, e.g., ribonucleoside subunits in which the 2'-OH
remains unsubstituted.

As mentioned above, the subunit linkages of an iRNA agent can be modified, e.g., to
promote resistance to degradation. These modifications can be provided between the
subunits of any of the regions, Y, X, and Z. However, it is preferred that they are minimized
and in particular it is preferred that consecutive modified linkages be avoided.

Thus, in a preferred embodiment, not all of the subunit linkages of the iRNA agent
are modified and more preferably the maximum number of consecutive subunits linked by
other than a phosphodiester bond will be 2, 3, or 4. Particularly preferred iRNA agents will not
have four or more consecutive subunits, e.g., 2'-hydroxyl ribonucleoside subunits, in which
each subunit is joined by modified linkages, i.e., linkages that have been modified to
stabilize them from degradation as compared to the phosphodiester linkages that naturally
occur in RNA and DNA.

It is particularly preferred to minimize the occurrence of modified linkages in region X. Thus, in preferred 
embodiments each of the nucleoside subunit linkages in X will be phosphodiester linkages, 
or if subunit linkages in region X are modified, such modifications will be minimized. E.g., 
although the Y and/or Z regions can include intersubunit linkages which have been 
stabilized against degradation, such modifications will be minimized in the X region, and in 
particular consecutive modifications will be minimized. Thus, in preferred embodiments the 
maximum number of consecutive subunits linked by other than a phosphodiester bond will be 
2, 3, or 4. Particularly preferred X regions will not have four or more consecutive subunits, 
e.g., 2'-hydroxyl ribonucleoside subunits, in which each subunit is joined by modified 
linkages, i.e., linkages that have been modified to stabilize them from degradation as 
compared to the phosphodiester linkages that naturally occur in RNA and DNA. 
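
By way of illustration only, the limit on consecutive modified linkages described above can be checked mechanically. The following is a minimal sketch and not part of the original disclosure; the representation of a strand as a list of linkage labels, the labels "PO" (phosphodiester) and "PS" (phosphorothioate), and the function names are all assumptions made for the example. A run of r consecutive modified linkages joins r + 1 consecutive subunits.

```python
def longest_modified_run(linkages, natural="PO"):
    """Length of the longest run of consecutive non-phosphodiester linkages."""
    best = run = 0
    for link in linkages:
        # Extend the current run on a modified linkage, reset it otherwise.
        run = run + 1 if link != natural else 0
        best = max(best, run)
    return best


def max_subunits_joined_by_modified(linkages, natural="PO"):
    """Largest number of consecutive subunits joined only by modified linkages."""
    run = longest_modified_run(linkages, natural)
    return run + 1 if run else 0


# Hypothetical X region with five subunits, i.e. four intersubunit linkages.
x_region_linkages = ["PO", "PS", "PS", "PO"]
print(max_subunits_joined_by_modified(x_region_linkages))
```

Under these assumptions, a region satisfies the particularly preferred condition (no four or more consecutive subunits joined by modified linkages) when the second function returns a value below 4.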

In a preferred embodiment Y and/or Z will be free of phosphorothioate linkages, 
though either or both may contain other modifications, e.g., other modifications of the 
subunit linkages. 

In a preferred embodiment region X, or in some cases, the entire iRNA agent, has no 
more than 3 or no more than 4 subunits having identical 2' moieties. 

In a preferred embodiment region X, or in some cases, the entire iRNA agent, has no 
more than 3 or no more than 4 subunits having identical subunit linkages. 

In a preferred embodiment one or more phosphorothioate linkages (or other 
modifications of the subunit linkage) are present in Y and/or Z, but such modified linkages 
do not connect two adjacent subunits, e.g., nucleosides, having a 2' modification, e.g., a 
2'-O-alkyl moiety. E.g., any adjacent 2'-O-alkyl moieties in the Y and/or Z are connected by a 
linkage other than a phosphorothioate linkage. 

In a preferred embodiment each of Y and/or Z independently has only one 
phosphorothioate linkage between adjacent subunits, e.g., nucleosides, having a 2' 
modification, e.g., 2'-O-alkyl nucleosides. If there is a second set of adjacent subunits, e.g., 
nucleosides, having a 2' modification, e.g., 2'-O-alkyl nucleosides, in Y and/or Z, that 
second set is connected by a linkage other than a phosphorothioate linkage, e.g., a modified 
linkage other than a phosphorothioate linkage. 

In a preferred embodiment each of Y and/or Z independently has more than one 
phosphorothioate linkage connecting adjacent pairs of subunits, e.g., nucleosides, having a 2' 
modification, e.g., 2'-O-alkyl nucleosides, but at least one pair of adjacent subunits, e.g., 
nucleosides, having a 2' modification, e.g., 2'-O-alkyl nucleosides, is connected by a 
linkage other than a phosphorothioate linkage, e.g., a modified linkage other than a 
phosphorothioate linkage. 
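
The three embodiments above differ only in how many phosphorothioate linkages are allowed to bridge two adjacent 2'-modified subunits (none, exactly one, or more than one). As an illustrative sketch only, and not part of the original disclosure, such linkages can be located as follows; the list-based representation, the labels "OH", "OMe", "PO" and "PS", and the function name are assumptions made for the example.

```python
def ps_between_modified(sugars, linkages, unmodified="OH", ps="PS"):
    """Indices i of phosphorothioate linkages joining subunits i and i+1
    when both of those subunits carry a 2' modification."""
    if len(linkages) != len(sugars) - 1:
        raise ValueError("expected one linkage between each pair of adjacent subunits")
    return [
        i
        for i, link in enumerate(linkages)
        if link == ps and sugars[i] != unmodified and sugars[i + 1] != unmodified
    ]


# Hypothetical Y region: 2' substituents per subunit and the linkages between them.
sugars = ["OMe", "OMe", "OH", "OMe", "OMe"]
linkages = ["PS", "PO", "PS", "PO"]
print(ps_between_modified(sugars, linkages))
```

An empty result corresponds to the first embodiment, a single index to the second, and several indices to the third, provided at least one adjacent 2'-modified pair remains joined by a different linkage.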

In a preferred embodiment one of the above recited limitations on adjacent subunits in 
Y and/or Z is combined with a limitation on the subunits in X. E.g., one or more 
phosphorothioate linkages (or other modifications of the subunit linkage) are present in Y 
and/or Z, but such modified linkages do not connect two adjacent subunits, e.g., nucleosides, 
having a 2' modification, e.g., a 2'-O-alkyl moiety. E.g., any adjacent 2'-O-alkyl moieties in 
the Y and/or Z are connected by a linkage other than a phosphorothioate linkage. In 
addition, the X region has no more than 3 or no more than 4 identical subunits, e.g., subunits 
having identical 2' moieties, or the X region has no more than 3 or no more than 4 subunits 
having identical subunit linkages. 

A Y and/or Z region can include at least one, and preferably 2, 3 or 4 of a 
modification disclosed herein. Such modifications can be chosen, independently, from any 
modification described herein, e.g., from nuclease resistant subunits, subunits with modified 
bases, subunits with modified intersubunit linkages, subunits with modified sugars, and 
subunits linked to another moiety, e.g., a targeting moiety. In a preferred embodiment more 
than 1 of such subunits can be present but in some embodiments it is preferred that no more 
than 1, 2, 3, or 4 of such modifications occur, or occur consecutively. In a preferred 
embodiment the frequency of the modification will differ between Y and/or Z and X, e.g., the 
modification will be present in one of Y and/or Z or X and absent in the other. 

An X region can include at least one, and preferably 2, 3 or 4 of a modification 
disclosed herein. Such modifications can be chosen, independently, from any modification 
described herein, e.g., from nuclease resistant subunits, subunits with modified bases, subunits 
with modified intersubunit linkages, subunits with modified sugars, and subunits linked to 
another moiety, e.g., a targeting moiety. In a preferred embodiment more than 1 of such 
subunits can be present but in some embodiments it is preferred that no more than 1, 2, 3, or 4 
of such modifications occur, or occur consecutively. 

An RRMS (described elsewhere herein) can be introduced at one or more points in one 
or both strands of a double-stranded iRNA agent. An RRMS can be placed in a Y and/or Z 
region, at or near (within 1, 2, or 3 positions of) the 3' or 5' end of the sense strand or at or near 
(within 2 or 3 positions of) the 3' end of the antisense strand. In some embodiments it is 
preferred to not have an RRMS at or near (within 1, 2, or 3 positions of) the 5' end of the 
antisense strand. An RRMS can be positioned in the X region, and will preferably be 
positioned in the sense strand or in an area of the antisense strand not critical for antisense 
binding to the target. 

Differential Modification of Terminal Duplex Stability 

In one aspect, the invention features an iRNA agent which can have differential 
modification of terminal duplex stability (DMTDS). 

In addition, the invention includes iRNA agents having DMTDS and another element 
described herein. E.g., the invention includes an iRNA agent described herein, e.g., a 
palindromic iRNA agent, an iRNA agent having a non canonical pairing, an iRNA agent 
which targets a gene described herein, e.g., a gene active in the liver, an iRNA agent having 
an architecture or structure described herein, an iRNA associated with an amphipathic 
delivery agent described herein, an iRNA associated with a drug delivery module described 
herein, an iRNA agent administered as described herein, or an iRNA agent formulated as 
described herein, which also incorporates DMTDS. 

iRNA agents can be optimized by increasing the propensity of the duplex to 
disassociate or melt (decreasing the free energy of duplex association), in the region of the 5' 
end of the antisense strand duplex. This can be accomplished, e.g., by the inclusion of 
subunits which increase the propensity of the duplex to disassociate or melt in the region of 
the 5' end of the antisense strand. It can also be accomplished by the attachment of a ligand 
that increases the propensity of the duplex to disassociate or melt in the region of the 5' end. 
While not wishing to be bound by theory, the effect may be due to promoting the effect of an 
enzyme such as helicase, for example, promoting the effect of the enzyme in the proximity of 
the 5' end of the antisense strand. 

The inventors have also discovered that iRNA agents can be optimized by decreasing 
the propensity of the duplex to disassociate or melt (increasing the free energy of duplex 
association), in the region of the 3' end of the antisense strand duplex. This can be 
accomplished, e.g., by the inclusion of subunits which decrease the propensity of the duplex 
to disassociate or melt in the region of the 3' end of the antisense strand. It can also be 
accomplished by the attachment of a ligand that decreases the propensity of the duplex to 
disassociate or melt in the region of the 5' end. 

Modifications which increase the tendency of the 5' end of the duplex to dissociate 
can be used alone or in combination with other modifications described herein, e.g., with 
modifications which decrease the tendency of the 3' end of the duplex to dissociate. 
Likewise, modifications which decrease the tendency of the 3' end of the duplex to dissociate 
can be used alone or in combination with other modifications described herein, e.g., with 
modifications which increase the tendency of the 5' end of the duplex to dissociate. 

Decreasing the stability of the AS 5' end of the duplex 

Subunit pairs can be ranked on the basis of their propensity to promote dissociation or 
melting (e.g., on the free energy of association or dissociation of a particular pairing; the 
simplest approach is to examine the pairs on an individual pair basis, though next neighbor or 
similar analysis can also be used). In terms of promoting dissociation: 

A:U is preferred over G:C; 
G:U is preferred over G:C; 
I:C is preferred over G:C (I=inosine); 

mismatches, e.g., non-canonical or other than canonical pairings (as described 
elsewhere herein) are preferred over canonical (A:T, A:U, G:C) pairings; 

pairings which include a universal base are preferred over canonical pairings. 

A typical ds iRNA agent can be diagrammed as follows: 

S     5' R1 N1 N2 N3 N4 N5 [N] N-5 N-4 N-3 N-2 N-1 R2 3' 
AS    3' R3 N1 N2 N3 N4 N5 [N] N-5 N-4 N-3 N-2 N-1 R4 5' 

S:AS        P1 P2 P3 P4 P5 [N] P-5 P-4 P-3 P-2 P-1    5' 

S indicates the sense strand; AS indicates the antisense strand; R1 indicates an optional 
(and nonpreferred) 5' sense strand overhang; R2 indicates an optional (though preferred) 3' 
sense overhang; R3 indicates an optional (though preferred) 3' antisense overhang; R4 
indicates an optional (and nonpreferred) 5' antisense overhang; N indicates subunits; [N] 
indicates that additional subunit pairs may be present; and Px indicates a pairing of sense Nx 
and antisense Nx. Overhangs are not shown in the P diagram. In some embodiments a 3' AS 
overhang corresponds to region Z, the duplex region corresponds to region X, and the 3' S 
strand overhang corresponds to region Y, as described elsewhere herein. (The diagram is not 
meant to imply maximum or minimum lengths, on which guidance is provided elsewhere 
herein.) 

It is preferred that pairings which decrease the propensity to form a duplex are used at 
1 or more of the positions in the duplex at the 5' end of the AS strand. The terminal pair (the 
most 5' pair in terms of the AS strand) is designated as P-1, and the subsequent pairing 
positions (going in the 3' direction in terms of the AS strand) in the duplex are designated 
P-2, P-3, P-4, P-5, and so on. The preferred region in which to modify to modulate duplex 
formation is at P-5 through P-1, more preferably P-4 through P-1, more preferably P-3 through 
P-1. Modification at P-1 is particularly preferred, alone or with modification(s) at other 
position(s), e.g., any of the positions just identified. It is preferred that at least 1, and more 
preferably 2, 3, 4, or 5 of the pairs of one of the recited regions be chosen independently 
from the group of: 

A:U 

G:U 
I:C 

mismatched pairs, e.g., non-canonical or other than canonical pairings or pairings 
which include a universal base. 

In preferred embodiments the change in subunit needed to achieve a pairing which 
promotes dissociation will be made in the sense strand, though in some embodiments the 
change will be made in the antisense strand. 

In a preferred embodiment at least 2, or 3, of the pairs in P-1 through P-4 are pairs 
which promote dissociation. 

In a preferred embodiment at least 2, or 3, of the pairs in P-1 through P-4 are A:U. 

In a preferred embodiment at least 2, or 3, of the pairs in P-1 through P-4 are G:U. 

In a preferred embodiment at least 2, or 3, of the pairs in P-1 through P-4 are I:C. 

In a preferred embodiment at least 2, or 3, of the pairs in P-1 through P-4 are 
mismatched pairs, e.g., non-canonical or other than canonical pairings. 

In a preferred embodiment at least 2, or 3, of the pairs in P-1 through P-4 are 
pairings which include a universal base. 
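
By way of illustration only, the P numbering and the counting of dissociation-promoting pairs at the AS 5' end can be sketched as follows. This is not part of the original disclosure. It assumes that both strands are given 5' to 3' over the duplex region only (no overhangs), that an RNA alphabet is used with "I" standing for inosine, and that every pairing other than G:C, A:U, G:U or I:C counts as a mismatch; universal bases are not modeled. All function names are hypothetical.

```python
def pair_at(sense, antisense, k):
    """Return (sense base, antisense base) at pairing position P_k.

    k > 0 counts from the AS 3' end (P1, P2, ...);
    k < 0 counts from the AS 5' end (P-1, P-2, ...).
    """
    if len(sense) != len(antisense):
        raise ValueError("duplex strands must have equal length")
    if k == 0 or abs(k) > len(sense):
        raise ValueError("k must be a nonzero position within the duplex")
    if k > 0:
        return sense[k - 1], antisense[-k]
    return sense[k], antisense[-k - 1]


def classify(pair):
    """Label a pair by the category the text lists, or None for G:C."""
    bases = frozenset(pair)
    if bases == {"G", "C"}:
        return None
    if bases == {"A", "U"}:
        return "A:U"
    if bases == {"G", "U"}:
        return "G:U"
    if bases == {"I", "C"}:
        return "I:C"
    return "mismatch"


def count_dissociation_promoting(sense, antisense, positions=(-1, -2, -3, -4)):
    """Number of listed positions occupied by a dissociation-promoting pair."""
    return sum(classify(pair_at(sense, antisense, k)) is not None for k in positions)


# Hypothetical 12-pair duplex, both strands written 5' to 3'.
sense = "GGCCGGCCAUAU"
antisense = "AUAUGGCCGGCC"
print(count_dissociation_promoting(sense, antisense))
```

A result of 2 or more for positions P-1 through P-4 would correspond to the "at least 2, or 3" embodiments above, under the stated assumptions.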

Increasing the stability of the AS 3' end of the duplex 

Subunit pairs can be ranked on the basis of their propensity to promote stability and 
inhibit dissociation or melting (e.g., on the free energy of association or dissociation of a 
particular pairing; the simplest approach is to examine the pairs on an individual pair basis, 
though next neighbor or similar analysis can also be used). In terms of promoting duplex 
stability: 

G:C is preferred over A:U; 

Watson-Crick matches (A:T, A:U, G:C) are preferred over non-canonical or other 
than canonical pairings; 

analogs that increase stability are preferred over Watson-Crick matches (A:T, A:U, 
G:C); 

2-amino-A:U is preferred over A:U; 
2-thio U or 5 Me-thio-U:A are preferred over U:A; 
G-clamp (an analog of C having 4 hydrogen bonds):G is preferred over C:G; 
guanidinium-G-clamp:G is preferred over C:G; 
pseudouridine:A is preferred over U:A; 

sugar modifications, e.g., 2' modifications, e.g., 2'F, ENA, or LNA, which enhance 
binding are preferred over non-modified moieties and can be present on one or both strands 
to enhance stability of the duplex. 

It is preferred that pairings which increase the propensity 
to form a duplex are used at 1 or more of the positions in the duplex at the 3' end of the AS 
strand. The terminal pair (the most 3' pair in terms of the AS strand) is designated as P1, and 
the subsequent pairing positions (going in the 5' direction in terms of the AS strand) in the 
duplex are designated P2, P3, P4, P5, and so on. The preferred region in which to modify to 
modulate duplex formation is at P5 through P1, more preferably P4 through P1, more 
preferably P3 through P1. Modification at P1 is particularly preferred, alone or with 
modification(s) at other position(s), e.g., any of the positions just identified. It is preferred that 
at least 1, and more preferably 2, 3, 4, or 5 of the pairs of the recited regions be chosen 
independently from the group of: 

G:C 

a pair having an analog that increases stability over Watson-Crick matches (A:T, 
A:U, G:C) 

2-amino-A:U 
2-thio U or 5 Me-thio-U:A 
G-clamp (an analog of C having 4 hydrogen bonds):G 
guanidinium-G-clamp:G 
pseudouridine:A 

a pair in which one or both subunits has a sugar modification, e.g., a 2' 
modification, e.g., 2'F, ENA, or LNA, which enhances binding. 

In a preferred embodiment at least 2, or 3, of the pairs in P-1 through P-4 are pairs 
which promote duplex stability. 

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are G:C. 

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are a pair 
having an analog that increases stability over Watson-Crick matches. 

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are 
2-amino-A:U. 

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are 2-thio 
U or 5 Me-thio-U:A. 

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are 
G-clamp:G. 

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are 
guanidinium-G-clamp:G. 

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are 
pseudouridine:A. 

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are a pair 
in which one or both subunits has a sugar modification, e.g., a 2' modification, e.g., 2'F, 
ENA, or LNA, which enhances binding. 

G-clamps and guanidinium G-clamps are discussed in the following references: 
Holmes and Gait, "The Synthesis of 2'-O-Methyl G-Clamp Containing Oligonucleotides and 
Their Inhibition of the HIV-1 Tat-TAR Interaction," Nucleosides, Nucleotides & Nucleic 
Acids, 22:1259-1262, 2003; Holmes et al., "Steric inhibition of human immunodeficiency 
virus type-1 Tat-dependent trans-activation in vitro and in cells by oligonucleotides 
containing 2'-O-methyl G-clamp ribonucleoside analogues," Nucleic Acids Research, 
31:2759-2768, 2003; Wilds, et al., "Structural basis for recognition of guanosine by a 
synthetic tricyclic cytosine analogue: Guanidinium G-clamp," Helvetica Chimica Acta, 
86:966-978, 2003; Rajeev, et al., "High-Affinity Peptide Nucleic Acid Oligomers 
Containing Tricyclic Cytosine Analogues," Organic Letters, 4:4395-4398, 2002; Ausin, et 
al., "Synthesis of Amino- and Guanidino-G-Clamp PNA Monomers," Organic Letters, 
4:4073-4075, 2002; Maier et al., "Nuclease resistance of oligonucleotides containing the 
tricyclic cytosine analogues phenoxazine and 9-(2-aminoethoxy)-phenoxazine ("G-clamp") 
and origins of their nuclease resistance properties," Biochemistry, 41:1323-7, 2002; 
Flanagan, et al., "A cytosine analog that confers enhanced potency to antisense 
oligonucleotides," Proceedings Of The National Academy Of Sciences Of The United States 
Of America, 96:3513-8, 1999. 

Simultaneously decreasing the stability of the AS 5' end of the duplex and increasing 
the stability of the AS 3' end of the duplex 

As is discussed above, an iRNA agent can be modified to both decrease the stability 
of the AS 5' end of the duplex and increase the stability of the AS 3' end of the duplex. This 
can be effected by combining one or more of the stability decreasing modifications in the AS 
5' end of the duplex with one or more of the stability increasing modifications in the AS 3' 
end of the duplex. Accordingly a preferred embodiment includes modification in P-5 through 
P-1, more preferably P-4 through P-1 and more preferably P-3 through P-1. Modification at P-1 
is particularly preferred, alone or with other positions, e.g., the positions just identified. It is 
preferred that at least 1, and more preferably 2, 3, 4, or 5 of the pairs of one of the recited 
regions of the AS 5' end of the duplex region be chosen independently from the group of: 



A:U 
G:U 
I:C 
mismatched pairs, e.g., non-canonical or other than canonical pairings which 
include a universal base; and 

a modification in P5 through P1, more preferably P4 through P1 and more preferably 
P3 through P1. Modification at P1 is particularly preferred, alone or with other positions, e.g., 
the positions just identified. It is preferred that at least 1, and more preferably 2, 3, 4, or 5 of 
the pairs of one of the recited regions of the AS 3' end of the duplex region be chosen 
independently from the group of: 

G:C 

a pair having an analog that increases stability over Watson-Crick matches (A:T, 
A:U, G:C) 

2-amino-A:U 
2-thio U or 5 Me-thio-U:A 
G-clamp (an analog of C having 4 hydrogen bonds):G 
guanidinium-G-clamp:G 
pseudouridine:A 

a pair in which one or both subunits has a sugar modification, e.g., a 2' 
modification, e.g., 2'F, ENA, or LNA, which enhances binding. 

The invention also includes methods of selecting and making iRNA agents having 
DMTDS. E.g., when screening a target sequence for candidate sequences for use as iRNA 
agents one can select sequences having a DMTDS property described herein or one which 
can be modified, preferably with as few changes as possible, especially to the 
AS strand, to provide a desired level of DMTDS. 

The invention also includes providing a candidate iRNA agent sequence, and 
modifying at least one P in P-5 through P-1 and/or at least one P in P5 through P1 to provide a 
DMTDS iRNA agent. 
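
The screening step described above can be illustrated with a short sketch. It is offered only as an example under stated assumptions and is not part of the original disclosure: the candidate duplex is taken to be fully complementary to the target (so every pair is A:U or G:C), the sense strand is taken to have the same sequence as the target window, and the window length, the number of terminal pairs examined, and the threshold are hypothetical parameters. Because the AS 5' end pairs with the sense 3' end, P-1 through P-4 correspond to the last four sense bases and P1 through P4 to the first four.

```python
def dmtds_counts(sense, window=4):
    """Count A:U pairs at the AS 5' end and G:C pairs at the AS 3' end.

    `sense` is the duplex-region sense strand, 5' to 3'; a fully
    complementary antisense strand is assumed.
    """
    weak_at_as_5p = sum(base in "AU" for base in sense[-window:])
    strong_at_as_3p = sum(base in "GC" for base in sense[:window])
    return weak_at_as_5p, strong_at_as_3p


def screen(target, length=19, window=4, min_each=3):
    """Return (offset, sense, weak, strong) for windows meeting both thresholds."""
    hits = []
    for i in range(len(target) - length + 1):
        sense = target[i:i + length]
        weak, strong = dmtds_counts(sense, window)
        if weak >= min_each and strong >= min_each:
            hits.append((i, sense, weak, strong))
    return hits


# Hypothetical toy target; short windows are used only to keep the example readable.
for hit in screen("GCGAAUUAGG", length=8, window=3, min_each=3):
    print(hit)
```

Candidates that fall short of a threshold could then be considered for the pair changes listed above, preferably in the sense strand.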

DMTDS iRNA agents can be used in any method described herein, e.g., to silence 
any gene disclosed herein, to treat any disorder described herein, in any formulation 
described herein, and generally in and/or with the methods and compositions described 
elsewhere herein. DMTDS iRNA agents can incorporate other modifications described 
herein, e.g., the attachment of targeting agents or the inclusion of modifications which 
enhance stability, e.g., the inclusion of nuclease resistant monomers or the inclusion of single 
strand overhangs (e.g., 3' AS overhangs and/or 3' S strand overhangs) which self associate to 
form intrastrand duplex structure. 

Preferably these iRNA agents will have an architecture described herein. 

Other Embodiments 

In vivo Delivery 

An iRNA agent can be linked, e.g., noncovalently linked to a polymer for the efficient 
delivery of the iRNA agent to a subject, e.g., a mammal, such as a human. The iRNA agent 
can, for example, be complexed with cyclodextrin. Cyclodextrins have been used as delivery 
vehicles of therapeutic compounds. Cyclodextrins can form inclusion complexes with drugs 
that are able to fit into the hydrophobic cavity of the cyclodextrin. In other examples, 
cyclodextrins form non-covalent associations with other biologically active molecules such 
as oligonucleotides and derivatives thereof. The use of cyclodextrins creates a water-soluble 
drug delivery complex that can be modified with targeting or other functional groups. 
Cyclodextrin cellular delivery systems for oligonucleotides described in U.S. Pat. No. 
5,691,316, which is hereby incorporated by reference, are suitable for use in methods of the 
invention. In this system, an oligonucleotide is noncovalently complexed with a 
cyclodextrin, or the oligonucleotide is covalently bound to adamantane which in turn is 
non-covalently associated with a cyclodextrin. 

The delivery molecule can include a linear cyclodextrin copolymer or a linear 
oxidized cyclodextrin copolymer having at least one ligand bound to the cyclodextrin 
copolymer. Delivery systems, as described in U.S. Patent No. 6,509,323, herein 
incorporated by reference, are suitable for use in methods of the invention. An iRNA agent 
can be bound to the linear cyclodextrin copolymer and/or a linear oxidized cyclodextrin 
copolymer. Either or both of the cyclodextrin or oxidized cyclodextrin copolymers can be 
crosslinked to another polymer and/or bound to a ligand. 

A composition for iRNA delivery can employ an "inclusion complex," a molecular 
compound having the characteristic structure of an adduct. In this structure, the "host 
molecule" spatially encloses at least part of another compound in the delivery vehicle. The 
enclosed compound (the "guest molecule") is situated in the cavity of the host molecule 
without affecting the framework structure of the host. A "host" is preferably cyclodextrin, 
but can be any of the molecules suggested in U.S. Patent Publ. 2003/0008818, herein 
incorporated by reference. 

Cyclodextrins can interact with a variety of ionic and molecular species, and the 
resulting inclusion compounds belong to the class of "host-guest" complexes. Within the 
host-guest relationship, the binding sites of the host and guest molecules should be 
complementary in the stereoelectronic sense. A composition of the invention can contain at 
least one polymer and at least one therapeutic agent, generally in the form of a particulate 
composite of the polymer and therapeutic agent, e.g., the iRNA agent. The iRNA agent can 
contain one or more complexing agents. At least one polymer of the particulate composite 
can interact with the complexing agent in a host-guest or a guest-host interaction to form an 
inclusion complex between the polymer and the complexing agent. The polymer and, more 
particularly, the complexing agent can be used to introduce functionality into the 
composition. For example, at least one polymer of the particulate composite has host 
functionality and forms an inclusion complex with a complexing agent having guest 
functionality. Alternatively, at least one polymer of the particulate composite has guest 
functionality and forms an inclusion complex with a complexing agent having host 
functionality. A polymer of the particulate composite can also contain both host and guest 
functionalities and form inclusion complexes with guest complexing agents and host 
complexing agents. A polymer with functionality can, for example, facilitate cell targeting 
and/or cell contact (e.g., targeting or contact to a liver cell), intercellular trafficking, and/or 
cell entry and release. 

Upon forming the particulate composite, the iRNA agent may or may not retain its 
biological or therapeutic activity. Upon release from the therapeutic composition, 
specifically, from the polymer of the particulate composite, the activity of the iRNA agent is 
restored. Accordingly, the particulate composite advantageously affords the iRNA agent 
protection against loss of activity due to, for example, degradation and offers enhanced 
bioavailability. Thus, a composition may be used to provide stability, particularly storage or 
solution stability, to an iRNA agent or any active chemical compound. The iRNA agent may 
be further modified with a ligand prior to or after particulate composite or therapeutic 
composition formation. The ligand can provide further functionality. For example, the 
ligand can be a targeting moiety. 

Physiological Effects 

The iRNA agents described herein can be designed such that determining therapeutic 
toxicity is made easier by the complementarity of the iRNA agent with both a human and a 
non-human animal sequence. By these methods, an iRNA agent can consist of a sequence 
that is fully complementary to a nucleic acid sequence from a human and a nucleic acid 
sequence from at least one non-human animal, e.g., a non-human mammal, such as a rodent, 
ruminant or primate. For example, the non-human mammal can be a mouse, rat, dog, pig, 
goat, sheep, cow, monkey, Pan paniscus, Pan troglodytes, Macaca mulatta, or Cynomolgus 
monkey. The sequence of the iRNA agent could be complementary to sequences within 
homologous genes, e.g., oncogenes or tumor suppressor genes, of the non-human mammal 
and the human. By determining the toxicity of the iRNA agent in the non-human mammal, 
one can extrapolate the toxicity of the iRNA agent in a human. For a more strenuous toxicity 
test, the iRNA agent can be complementary to a human and more than one, e.g., two or three 
or more, non-human animals. 

The methods described herein can be used to correlate any physiological effect of an iRNA 
agent on a human, e.g., any unwanted effect, such as a toxic effect, or any positive, or desired 
effect. 
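
By way of illustration only, the requirement that an iRNA agent be fully complementary to both a human sequence and a non-human sequence can be expressed as a simple sequence check. The sketch below is not part of the original disclosure; the toy transcripts are invented for the example and do not correspond to any real gene, and the function names are hypothetical. Sequences are written 5' to 3' in an RNA alphabet.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def is_fully_complementary(antisense, transcript):
    """True if the antisense strand pairs perfectly with some site in the transcript."""
    return reverse_complement(antisense) in transcript


def species_coverage(antisense, transcripts):
    """Map each species name to whether the antisense strand has a perfect site."""
    return {name: is_fully_complementary(antisense, seq) for name, seq in transcripts.items()}


# Invented toy transcripts; the shared site is GCUUAGCCGAU in human and mouse only.
transcripts = {
    "human": "AAGCUUAGCCGAUGG",
    "mouse": "CCGCUUAGCCGAUAA",
    "rat": "CCGCUUAGGCGAUAA",
}
antisense = reverse_complement("GCUUAGCCGAU")
print(species_coverage(antisense, transcripts))
```

In this toy case the agent would be suitable for a toxicity comparison between human and mouse but not rat, since the rat site carries a single mismatch.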

Delivery Module 

In one aspect, the invention features a drug delivery conjugate or module, such as 
those described herein and those described in copending, co-owned United States Provisional 
Application Serial No. 60/454,265, filed on March 12, 2003, which is hereby incorporated by 
reference. 

In addition, the invention includes iRNA agents described herein, e.g., a palindromic 
iRNA agent, an iRNA agent having a non canonical pairing, an iRNA agent which targets a 
gene described herein, e.g., a gene active in the liver, an iRNA agent having a chemical 
modification described herein, e.g., a modification which enhances resistance to degradation, 
an iRNA agent having an architecture or structure described herein, an iRNA agent 
administered as described herein, or an iRNA agent formulated as described herein, 
combined with, associated with, and delivered by such a drug delivery conjugate or module. 

The iRNA agents can be complexed to a delivery agent that features a modular 
complex. The complex can include a carrier agent linked to one or more of (preferably two 
or more, more preferably all three of): (a) a condensing agent (e.g., an agent capable of 
attracting, e.g., binding, a nucleic acid, e.g., through ionic or electrostatic interactions); (b) a 
fusogenic agent (e.g., an agent capable of fusing and/or being transported through a cell 
membrane, e.g., an endosome membrane); and (c) a targeting group, e.g., a cell or tissue 
targeting agent, e.g., a lectin, glycoprotein, lipid or protein, e.g., an antibody, that binds to a 
specified cell type such as a cancer cell, endothelial cell or bone cell. 

An iRNA agent, e.g., iRNA agent or sRNA agent described herein, can be linked, 
e.g., coupled or bound, to the modular complex. The iRNA agent can interact with the 
condensing agent of the complex, and the complex can be used to deliver an iRNA agent to a 
cell, e.g., in vitro or in vivo. For example, the complex can be used to deliver an iRNA agent 
to a subject in need thereof, e.g., to deliver an iRNA agent to a subject having a disorder, e.g., 
a disorder described herein, such as a disease or disorder of the liver. 

The fusogenic agent and the condensing agent can be different agents or one and 
the same agent. For example, a polyamino chain, e.g., polyethyleneimine (PEI), can be the 
fusogenic and/or the condensing agent. 

The delivery agent can be a modular complex. For example, the complex can include 
a carrier agent linked to one or more of (preferably two or more, more preferably all three 
of): 

(a) a condensing agent (e.g., an agent capable of attracting, e.g., binding, a nucleic 
acid, e.g., through ionic interaction), 

(b) a fusogenic agent (e.g., an agent capable of fusing and/or being transported 
through a cell membrane, e.g., an endosome membrane), and 

(c) a targeting group, e.g., a cell or tissue targeting agent, e.g., a lectin, glycoprotein, 
lipid or protein, e.g., an antibody, that binds to a specified cell type such as a cancer cell, 
endothelial cell, or bone cell. A targeting group can be a thyrotropin, melanotropin, lectin, 
glycoprotein, surfactant protein A, mucin carbohydrate, multivalent lactose, multivalent 
galactose, N-acetyl-galactosamine, N-acetyl-glucosamine, multivalent mannose, multivalent 
fucose, glycosylated polyaminoacids, multivalent galactose, transferrin, bisphosphonate, 
polyglutamate, polyaspartate, a lipid, cholesterol, a steroid, bile acid, folate, vitamin B12, 
biotin, Neproxin, or an RGD peptide or RGD peptide mimetic. 

Carrier agents 

The carrier agent of a modular complex described herein can be a substrate for 
attachment of one or more of: a condensing agent, a fusogenic agent, and a targeting group. 
The carrier agent would preferably lack an endogenous enzymatic activity. The agent would 
preferably be a biological molecule, preferably a macromolecule. Polymeric biological 
carriers are preferred. It would also be preferred that the carrier molecule be biodegradable. 

The carrier agent can be a naturally occurring substance, such as a protein (e.g., 
human serum albumin (HSA), low-density lipoprotein (LDL), or globulin); carbohydrate 
(e.g., a dextran, pullulan, chitin, chitosan, inulin, cyclodextrin or hyaluronic acid); or lipid. 
The carrier molecule can also be a recombinant or synthetic molecule, such as a synthetic 
polymer, e.g., a synthetic polyamino acid. Examples of polyamino acids include polylysine 
(PLL), poly L-aspartic acid, poly L-glutamic acid, styrene-maleic acid anhydride copolymer, 
poly(L-lactide-co-glycolide) copolymer, divinyl ether-maleic anhydride copolymer, 
N-(2-hydroxypropyl)methacrylamide copolymer (HPMA), polyethylene glycol (PEG), polyvinyl 
alcohol (PVA), polyurethane, poly(2-ethylacrylic acid), N-isopropylacrylamide polymers, or 
polyphosphazene. Other useful carrier molecules can be identified by routine methods. 

A carrier agent can be characterized by one or more of: (a) is at least 1 Da in size; (b) 
has at least 5 charged groups, preferably between 5 and 5000 charged groups; (c) is present 
in the complex at a ratio of at least 1:1 carrier agent to fusogenic agent; (d) is present in the 
complex at a ratio of at least 1:1 carrier agent to condensing agent; (e) is present in the 
complex at a ratio of at least 1:1 carrier agent to targeting agent. 

Fusogenic agents 

A fusogenic agent of a modular complex described herein can be an agent that is 
responsive to, e.g., changes charge depending on, the pH environment. Upon encountering 
the pH of an endosome, it can cause a physical change, e.g., a change in osmotic properties 
which disrupts or increases the permeability of the endosome membrane. Preferably, the 
fusogenic agent changes charge, e.g., becomes protonated, at a pH lower than the physiological 
range. For example, the fusogenic agent can become protonated at pH 4.5-6.5. The 
fusogenic agent can serve to release the iRNA agent into the cytoplasm of a cell after the 
complex is taken up, e.g., via endocytosis, by the cell, thereby increasing the cellular 
concentration of the iRNA agent in the cell. 

In one embodiment, the fusogenic agent can have a moiety, e.g., an amino group, 
which, when exposed to a specified pH range, will undergo a change, e.g., in charge, e.g., 
protonation. The change in charge of the fusogenic agent can trigger a change, e.g., an 
osmotic change, in a vesicle, e.g., an endocytic vesicle, e.g., an endosome. For example, the 
fusogenic agent, upon being exposed to the pH environment of an endosome, will cause a 
solubility or osmotic change substantial enough to increase the porosity of (preferably, to 
rupture) the endosomal membrane. 
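
As a rough illustration of the pH-dependent protonation described above, the fraction of an ionizable amino group that carries a charge can be estimated from the Henderson-Hasselbalch relation. The sketch below is not part of the disclosed method; the pKa of 6.0 is a hypothetical value assumed only for illustration, and the function name is likewise an assumption.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Fraction of a basic group present in its protonated (charged) form.

    Henderson-Hasselbalch for a base B / BH+:
        [BH+] / ([B] + [BH+]) = 1 / (1 + 10 ** (pH - pKa))
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Hypothetical pKa, assumed for illustration; not a value taken from the text.
ASSUMED_PKA = 6.0

if __name__ == "__main__":
    # Compare physiological pH with the 4.5-6.5 range mentioned in the text.
    for ph in (7.4, 6.5, 5.5, 4.5):
        share = fraction_protonated(ph, ASSUMED_PKA)
        print(f"pH {ph}: {share:.3f} protonated")
```

Under this assumed pKa, the group is mostly uncharged at physiological pH and mostly protonated toward the lower end of the endosomal range, which is the qualitative behavior the text attributes to a fusogenic agent.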

The fusogenic agent can be a polymer, preferably a polyamino chain, e.g., 
polyethyleneimine (PEI). The PEI can be linear, branched, synthetic or natural. The PEI can 
be, e.g., alkyl substituted PEI, or lipid substituted PEI. 

In other embodiments, the fusogenic agent can be polyhistidine, polyimidazole, 
polypyridine, polypropyleneimine, melittin, or a polyacetal substance, e.g., a cationic 
polyacetal. In some embodiments, the fusogenic agent can have an alpha helical structure. 
The fusogenic agent can be a membrane disruptive agent, e.g., melittin. 

A fusogenic agent can have one or more of the following characteristics: (a) is at least 
1 Da in size; (b) has at least 10 charged groups, preferably between 10 and 5000 charged 
groups, more preferably between 50 and 1000 charged groups; (c) is present in the complex 
at a ratio of at least 1:1 fusogenic agent to carrier agent; (d) is present in the complex at a 
ratio of at least 1:1 fusogenic agent to condensing agent; (e) is present in the complex at a 
ratio of at least 1:1 fusogenic agent to targeting agent. 

Other suitable fusogenic agents can be tested and identified by a skilled artisan. The 
ability of a compound to respond to, e.g., change charge depending on, the pH environment 
can be tested by routine methods, e.g., in a cellular assay. For example, a test compound is 
combined or contacted with a cell, and the cell is allowed to take up the test compound, e.g., 
by endocytosis. An endosome preparation can then be made from the contacted cells and the 
endosome preparation compared to an endosome preparation from control cells. A change, 
e.g., a decrease, in the endosome fraction from the contacted cell vs. the control cell indicates 
that the test compound can function as a fusogenic agent. Alternatively, the contacted cell 
and control cell can be evaluated, e.g., by microscopy, e.g., by light or electron microscopy, 
to determine a difference in endosome population in the cells. The test compound can be 
labeled. In another type of assay, a modular complex described herein is constructed using 
one or more test or putative fusogenic agents. The modular complex can be constructed 
using a labeled nucleic acid instead of the iRNA. The ability of the fusogenic agent to 
respond to, e.g., change charge depending on, the pH environment, once the modular 
complex is taken up by the cell, can be evaluated, e.g., by preparation of an endosome 
preparation, or by microscopy techniques, as described above. A two-step assay can also be 
performed, wherein a first assay evaluates the ability of a test compound alone to respond to, 
e.g., change charge depending on, the pH environment; and a second assay evaluates the 
ability of a modular complex that includes the test compound to respond to, e.g., change 
charge depending on, the pH environment. 

Condensing agent 

The condensing agent of a modular complex described herein can interact with (e.g., 
attracts, holds, or binds to) an iRNA agent and act to (a) condense, e.g., reduce the size or 
charge of the iRNA agent and/or (b) protect the iRNA agent, e.g., protect the iRNA agent 
against degradation. The condensing agent can include a moiety, e.g., a charged moiety, that 
can interact with a nucleic acid, e.g., an iRNA agent, e.g., by ionic interactions. The 
condensing agent would preferably be a charged polymer, e.g., a polycationic chain. The 
condensing agent can be a polylysine (PLL), spermine, spermidine, polyamine, 
pseudopeptide-polyamine, peptidomimetic polyamine, dendrimer polyamine, arginine, 
amidine, protamine, cationic lipid, cationic porphyrin, quaternary salt of a polyamine, or an 
alpha helical peptide. 

A condensing agent can have the following characteristics: (a) is at least 1 Da in size; (b) 
has at least 2 charged groups, preferably between 2 and 100 charged groups; (c) is present in 
the complex at a ratio of at least 1:1 condensing agent to carrier agent; (d) is present in the 
complex at a ratio of at least 1:1 condensing agent to fusogenic agent; (e) is present in the 
complex at a ratio of at least 1:1 condensing agent to targeting agent. 

Other suitable condensing agents can be tested and identified by a skilled artisan, e.g., 
by evaluating the ability of a test agent to interact with a nucleic acid, e.g., an iRNA agent. 
The ability of a test agent to interact with a nucleic acid, e.g., an iRNA agent, e.g., to 
condense or protect the iRNA agent, can be evaluated by routine techniques. In one assay, a 
test agent is contacted with a nucleic acid, and the size and/or charge of the contacted nucleic 
acid is evaluated by a technique suitable to detect changes in molecular mass and/or charge. 
Such techniques include non-denaturing gel electrophoresis, immunological methods, e.g., 
immunoprecipitation, gel filtration, ionic interaction chromatography, and the like. A test 
agent is identified as a condensing agent if it changes the mass and/or charge (preferably 
both) of the contacted nucleic acid, compared to a control. A two-step assay can also be 
performed, wherein a first assay evaluates the ability of a test compound alone to interact 
with, e.g., bind to, e.g., condense the charge and/or mass of, a nucleic acid; and a second assay 
evaluates the ability of a modular complex that includes the test compound to interact with, 
e.g., bind to, e.g., condense the charge and/or mass of, a nucleic acid. 

Amphipathic Delivery Agents 

In one aspect, the invention features an amphipathic delivery conjugate or module, 
such as those described herein and those described in copending, co-owned United States 
Provisional Application Serial No. 60/455,050 (Attorney Docket No. 14174-065P01), filed 
on March 13, 2003, which is hereby incorporated by reference. 

In addition, the invention includes an iRNA agent described herein, e.g., a palindromic 
iRNA agent, an iRNA agent having a non canonical pairing, an iRNA agent which targets a 
gene described herein, e.g., a gene active in the liver, an iRNA agent having a chemical 
modification described herein, e.g., a modification which enhances resistance to degradation, 
an iRNA agent having an architecture or structure described herein, an iRNA agent 
administered as described herein, or an iRNA agent formulated as described herein, 
combined with, associated with, and delivered by such an amphipathic delivery conjugate. 

An amphipathic molecule is a molecule having a hydrophobic and a hydrophilic 
region. Such molecules can interact with (e.g., penetrate or disrupt) lipids, e.g., a lipid 
bilayer of a cell. As such, they can serve as a delivery agent for an associated (e.g., bound) 
iRNA (e.g., an iRNA or sRNA described herein). A preferred amphipathic molecule to be 
used in the compositions described herein (e.g., the amphipathic iRNA constructs described 
herein) is a polymer. The polymer may have a secondary structure, e.g., a repeating 
secondary structure. 

One example of an amphipathic polymer is an amphipathic polypeptide, e.g., a 
polypeptide having a secondary structure such that the polypeptide has a hydrophilic and a 
hydrophobic face. The design of amphipathic peptide structures (e.g., alpha-helical 
polypeptides) is routine to one of skill in the art. For example, the following references 
provide guidance: Grell et al. (2001) Protein design and folding: template trapping of self- 
assembled helical bundles J Pept Sci 7(3): 146-51; Chen et al. (2002) Determination of 
stereochemistry stability coefficients of amino acid side-chains in an amphipathic alpha-helix 
J Pept Res 59(1):18-33; Iwata et al. (1994) Design and synthesis of amphipathic 3(10)-helical 
peptides and their interactions with phospholipid bilayers and ion channel formation J Biol 
Chem 269(7):4928-33; Cornut et al. (1994) The amphipathic alpha-helix concept. 
Application to the de novo design of ideally amphipathic Leu, Lys peptides with hemolytic 
activity higher than that of melittin FEBS Lett 349(1):29-33; Negrete et al. (1998) 
Deciphering the structural code for proteins: helical propensities in domain classes and 
statistical multiresidue information in alpha-helices. Protein Sci 7(6): 1368-79. 
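
One common way to check whether a candidate alpha-helical sequence segregates hydrophobic and hydrophilic residues onto opposite faces is to compute a hydrophobic moment on a helical wheel. The following is a minimal sketch, not a method taken from the cited references: the two-level hydrophobicity scale, the function names, and the example sequence are all assumptions made for illustration, and an ideal alpha helix with 100 degrees of rotation per residue is assumed.

```python
import math

# Simplified two-level scale, assumed for illustration only:
# +1 for hydrophobic residues, -1 for charged/polar residues, 0 otherwise.
HYDROPHOBIC = set("AILMFVWC")
HYDROPHILIC = set("KRDEQNHST")


def score(residue: str) -> int:
    """Return the assumed hydrophobicity score for a one-letter residue."""
    if residue in HYDROPHOBIC:
        return 1
    if residue in HYDROPHILIC:
        return -1
    return 0


def hydrophobic_moment(sequence: str, angle_deg: float = 100.0) -> float:
    """Mean hydrophobic moment per residue for an ideal helix.

    Residue i is placed on a helical wheel at i * angle_deg degrees and its
    score is added as a vector; a large resultant means the hydrophobic and
    hydrophilic residues sit on opposite faces.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    x = y = 0.0
    for i, residue in enumerate(sequence.upper()):
        theta = math.radians(i * angle_deg)
        x += score(residue) * math.cos(theta)
        y += score(residue) * math.sin(theta)
    return math.hypot(x, y) / len(sequence)


if __name__ == "__main__":
    # Hypothetical Leu/Lys sequence used only to exercise the function.
    print(hydrophobic_moment("LKKLLKLLKKLLKL"))
```

A sequence of identical residues gives a moment near zero on this measure, while a sequence designed so that hydrophobic residues cluster on one side of the wheel gives a value well above zero.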

Another example of an amphipathic polymer is a polymer made up of two or more 
amphipathic subunits, e.g., two or more subunits containing cyclic moieties (e.g., a cyclic 
moiety having one or more hydrophilic groups and one or more hydrophobic groups). For 
example, the subunit may contain a steroid, e.g., cholic acid; or an aromatic moiety. Such 
moieties preferably can exhibit atropisomerism, such that they can form opposing 
hydrophobic and hydrophilic faces when in a polymer structure. 

The ability of a putative amphipathic molecule to interact with a lipid membrane, e.g., 
a cell membrane, can be tested by routine methods, e.g., in a cell free or cellular assay. For 
example, a test compound is combined or contacted with a synthetic lipid bilayer, a cellular 
membrane fraction, or a cell, and the test compound is evaluated for its ability to interact 
with, penetrate or disrupt the lipid bilayer, cell membrane or cell. The test compound can be 
labeled in order to detect the interaction with the lipid bilayer, cell membrane or cell. In 
another type of assay, the test compound is linked to a reporter molecule or an iRNA agent 
(e.g., an iRNA or sRNA described herein) and the ability of the reporter molecule or iRNA 
agent to penetrate the lipid bilayer, cell membrane or cell is evaluated. A two-step assay can 
also be performed, wherein a first assay evaluates the ability of a test compound alone to 
interact with a lipid bilayer, cell membrane or cell; and a second assay evaluates the ability of 
a construct (e.g., a construct described herein) that includes the test compound and a reporter 
or iRNA agent to interact with a lipid bilayer, cell membrane or cell. 

An amphipathic polymer useful in the compositions described herein has at least 2, 
preferably at least 5, more preferably at least 10, 25, 50, 100, 200, 500, 1000, 2000, 50000 or 
more subunits (e.g., amino acids or cyclic subunits). A single amphipathic polymer can be 
linked to one or more, e.g., 2, 3, 5, 10 or more iRNA agents (e.g., iRNA or sRNA agents 
described herein). In some embodiments, an amphipathic polymer can contain both amino 
acid and cyclic subunits, e.g., aromatic subunits. 

The invention features a composition that includes an iRNA agent (e.g., an iRNA or 
sRNA described herein) in association with an amphipathic molecule. Such compositions 
may be referred to herein as "amphipathic iRNA constructs." Such compositions and 
constructs are useful in the delivery or targeting of iRNA agents, e.g., delivery or targeting 
of iRNA agents to a cell. While not wanting to be bound by theory, such compositions and 
constructs can increase the porosity of, e.g., can penetrate or disrupt, a lipid (e.g., a lipid 
bilayer of a cell), e.g., to allow entry of the iRNA agent into a cell. 

In one aspect, the invention relates to a composition comprising an iRNA agent (e.g., 
an iRNA or sRNA agent described herein) linked to an amphipathic molecule. The iRNA 
agent and the amphipathic molecule may be held in continuous contact with one another by 
either covalent or noncovalent linkages. 

The amphipathic molecule of the composition or construct is preferably other than a 
phospholipid, e.g., other than a micelle, membrane or membrane fragment. 

The amphipathic molecule of the composition or construct is preferably a polymer. 
The polymer may include two or more amphipathic subunits. One or more hydrophilic 
groups and one or more hydrophobic groups may be present on the polymer. The polymer 
may have a repeating secondary structure as well as a first face and a second face. The 
distribution of the hydrophilic groups and the hydrophobic groups along the repeating 
secondary structure can be such that one face of the polymer is a hydrophilic face and the 
other face of the polymer is a hydrophobic face. 

The amphipathic molecule can be a polypeptide, e.g., a polypeptide comprising an 
α-helical conformation as its secondary structure. 

In one embodiment, the amphipathic polymer includes one or more subunits 
containing one or more cyclic moiety (e.g., a cyclic moiety having one or more hydrophilic 
groups and/or one or more hydrophobic groups). In one embodiment, the polymer is a 
polymer of cyclic moieties such that the moieties have alternating hydrophobic and 
hydrophilic groups. For example, the subunit may contain a steroid, e.g., cholic acid. In 
another example, the subunit may contain an aromatic moiety. The aromatic moiety may be 
one that can exhibit atropisomerism, e.g., a 2,2'-bis(substituted)-1,1'-binaphthyl or a 
2,2'-bis(substituted) biphenyl. A subunit may include an aromatic moiety of Formula (M): 




(M) 


Referring to Formula M, R1 is C1-C100 alkyl optionally substituted with aryl, alkenyl, 
alkynyl, alkoxy or halo and/or optionally inserted with O, S, alkenyl or alkynyl; C1-C100 
perfluoroalkyl; or OR5. 

R2 is hydroxy; nitro; sulfate; phosphate; phosphate ester; sulfonic acid; OR6; or C1-C100 
alkyl optionally substituted with hydroxy, halo, nitro, aryl or alkyl sulfinyl, aryl or alkyl 
sulfonyl, sulfate, sulfonic acid, phosphate, phosphate ester, substituted or unsubstituted aryl, 
carboxyl, carboxylate, amino carbonyl, or alkoxycarbonyl, and/or optionally inserted with O, 
NH, S, S(O), SO2, alkenyl, or alkynyl. 

R3 is hydrogen, or when taken together with R4 forms a fused phenyl ring. 

R4 is hydrogen, or when taken together with R3 forms a fused phenyl ring. 

R5 is C1-C100 alkyl optionally substituted with aryl, alkenyl, alkynyl, alkoxy or halo 
and/or optionally inserted with O, S, alkenyl or alkynyl; or C1-C100 perfluoroalkyl; and R6 is 
C1-C100 alkyl optionally substituted with hydroxy, halo, nitro, aryl or alkyl sulfinyl, aryl or 
alkyl sulfonyl, sulfate, sulfonic acid, phosphate, phosphate ester, substituted or unsubstituted 
aryl, carboxyl, carboxylate, amino carbonyl, or alkoxycarbonyl, and/or optionally inserted 
with O, NH, S, S(O), SO2, alkenyl, or alkynyl. 

Increasing cellular uptake of dsRNAs 

A method of the invention can include the administration of an iRNA agent and a 
drug that affects the uptake of the iRNA agent into the cell. The drug can be administered 
before, after, or at the same time that the iRNA agent is administered. The drug can be 
covalently linked to the iRNA agent. The drug can be, for example, a lipopolysaccharide, an 
activator of p38 MAP kinase, or an activator of NF-κB. The drug can have a transient effect 
on the cell. 

The drug can increase the uptake of the iRNA agent into the cell, for example, by 
disrupting the cell's cytoskeleton, e.g., by disrupting the cell's microtubules, microfilaments, 
and/or intermediate filaments. The drug can be, for example, taxol, vincristine, vinblastine, 
cytochalasin, nocodazole, jasplakinolide, latrunculin A, phalloidin, swinholide A, indanocine, 
or myoseverin. 

The drug can also increase the uptake of the iRNA agent into the cell by activating an 
inflammatory response, for example. Exemplary drugs that would have such an effect 
include tumor necrosis factor alpha (TNF alpha), interleukin-1 beta, or gamma interferon. 

iRNA conjugates 

An iRNA agent can be coupled, e.g., covalently coupled, to a second agent. For 
example, an iRNA agent used to treat a particular disorder can be coupled to a second 
therapeutic agent, e.g., an agent other than the iRNA agent. The second therapeutic agent 
can be one which is directed to the treatment of the same disorder. For example, in the case 
of an iRNA used to treat a disorder characterized by unwanted cell proliferation, e.g., cancer, 
the iRNA agent can be coupled to a second agent which has an anti-cancer effect. For 
example, it can be coupled to an agent which stimulates the immune system, e.g., a CpG 
motif, or more generally an agent that activates a toll-like receptor and/or increases the 
production of gamma interferon. 

iRNA Production 

An iRNA can be produced, e.g., in bulk, by a variety of methods. Exemplary 
methods include: organic synthesis and RNA cleavage, e.g., in vitro cleavage. 

Organic Synthesis 

An iRNA can be made by separately synthesizing each respective strand of a double- 
stranded RNA molecule. The component strands can then be annealed. 

A large bioreactor, e.g., the OligoPilot II from Pharmacia Biotec AB (Uppsala, 
Sweden), can be used to produce a large amount of a particular RNA strand for a given 
iRNA. The OligoPilot II reactor can efficiently couple a nucleotide using only a 1.5 molar 
excess of a phosphoramidite nucleotide. To make an RNA strand, ribonucleotide amidites 
are used. Standard cycles of monomer addition can be used to synthesize the 21 to 23 
nucleotide strand for the iRNA. Typically, the two complementary strands are produced 
separately and then annealed, e.g., after release from the solid support and deprotection. 
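
Because each strand is built by repeated cycles of monomer addition, the expected share of full-length product falls with every coupling step. The sketch below illustrates that arithmetic only; the per-step coupling efficiency of 0.99 is a hypothetical figure, not a value stated in the text, and the calculation assumes the first nucleotide is already attached to the solid support so that a strand of n nucleotides needs n - 1 couplings.

```python
def full_length_yield(strand_length: int, coupling_efficiency: float) -> float:
    """Expected fraction of full-length strand after stepwise synthesis.

    Assumes the first nucleotide is pre-attached to the support, so a strand
    of n nucleotides requires n - 1 coupling steps of equal efficiency.
    """
    if strand_length < 1:
        raise ValueError("strand_length must be at least 1")
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must be between 0 and 1")
    return coupling_efficiency ** (strand_length - 1)


if __name__ == "__main__":
    # Hypothetical per-step efficiency, assumed for illustration.
    assumed_efficiency = 0.99
    for length in (21, 22, 23):
        share = full_length_yield(length, assumed_efficiency)
        print(f"{length} nt: {share:.3f} full-length")
```

The same function can be applied to each of the two complementary strands separately, since they are synthesized independently before annealing.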

Organic synthesis can be used to produce a discrete iRNA species. The 
complementarity of the species to a particular target gene can be precisely specified. For 
example, the species may be complementary to a region that includes a polymorphism, e.g., a 
single nucleotide polymorphism. Further, the location of the polymorphism can be precisely 
defined. In some embodiments, the polymorphism is located in an internal region, e.g., at 
least 4, 5, 7, or 9 nucleotides from one or both of the termini. 
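
The positional constraint on the polymorphism can be expressed as a simple check. This is an illustrative sketch only: the function name and the 0-based indexing are assumptions, and "distance from a terminus" is taken here to mean the number of nucleotides lying between the position and that end of the strand.

```python
def is_internal(position: int, strand_length: int, min_distance: int,
                both_termini: bool = True) -> bool:
    """Check whether a 0-based position lies far enough from the strand ends.

    With both_termini=True the position must be at least min_distance
    nucleotides from each terminus; with both_termini=False it only needs
    to be that far from at least one terminus.
    """
    if not 0 <= position < strand_length:
        raise ValueError("position must lie within the strand")
    from_5_prime = position                      # nucleotides before it
    from_3_prime = strand_length - 1 - position  # nucleotides after it
    if both_termini:
        return min(from_5_prime, from_3_prime) >= min_distance
    return max(from_5_prime, from_3_prime) >= min_distance


if __name__ == "__main__":
    # A polymorphism at the center of a 21-nucleotide strand.
    print(is_internal(position=10, strand_length=21, min_distance=9))
```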

dsRNA Cleavage 

iRNAs can also be made by cleaving a larger ds iRNA. The cleavage can be 
mediated in vitro or in vivo. For example, to produce iRNAs by cleavage in vitro, the 
following method can be used: 

In vitro transcription. dsRNA is produced by transcribing a nucleic acid (DNA) 
segment in both directions. For example, the HiScribe™ RNAi transcription kit (New 
England Biolabs) provides a vector and a method for producing a dsRNA for a nucleic acid 
segment that is cloned into the vector at a position flanked on either side by a T7 promoter. 
Separate templates are generated for T7 transcription of the two complementary strands for 
the dsRNA. The templates are transcribed in vitro by addition of T7 RNA polymerase and 
dsRNA is produced. Similar methods using PCR and/or other RNA polymerases (e.g., T3 or 
SP6 polymerase) can also be used. In one embodiment, RNA generated by this method is 
carefully purified to remove endotoxins that may contaminate preparations of the 
recombinant enzymes. 

In vitro cleavage. dsRNA is cleaved in vitro into iRNAs, for example, using a Dicer 
or comparable RNAse III-based activity. For example, the dsRNA can be incubated in an in 
vitro extract from Drosophila or using purified components, e.g., a purified RNAse or RISC 
complex (RNA-induced silencing complex). See, e.g., Ketting et al. Genes Dev 2001 Oct 
15;15(20):2654-9, and Hammond Science 2001 Aug 10;293(5532):1146-50. 

dsRNA cleavage generally produces a plurality of iRNA species, each being a 
particular 21 to 23 nt fragment of a source dsRNA molecule. For example, iRNAs that 
include sequences complementary to overlapping regions and adjacent regions of a source 
dsRNA molecule may be present. 
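
To make the idea of overlapping and adjacent 21 to 23 nt fragments concrete, the sketch below enumerates every window of that size along one strand of a source sequence. It is only a way of listing candidate fragments and does not model how Dicer or any other enzyme actually selects cleavage sites; the function name and the placeholder sequence are assumptions made for illustration.

```python
from typing import Iterator, Tuple


def candidate_fragments(source: str, min_len: int = 21,
                        max_len: int = 23) -> Iterator[Tuple[int, str]]:
    """Yield (start, fragment) for every window of min_len..max_len nt.

    Windows that start at neighboring positions overlap; windows whose
    start positions differ by the window length are adjacent.
    """
    for size in range(min_len, max_len + 1):
        for start in range(len(source) - size + 1):
            yield start, source[start:start + size]


if __name__ == "__main__":
    # Placeholder 25-nt sequence, used only to exercise the function.
    source = "AUGGCUACGUAGCUAGCUAGGCUAA"
    fragments = list(candidate_fragments(source))
    print(len(fragments), "candidate fragments")
```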

Regardless of the method of synthesis, the iRNA preparation can be prepared in a 
solution (e.g., an aqueous and/or organic solution) that is appropriate for formulation. For 
example, the iRNA preparation can be precipitated and redissolved in pure double-distilled 
water, and lyophilized. The dried iRNA can then be resuspended in a solution appropriate for 
the intended formulation process. 

Synthesis of modified and nucleotide surrogate iRNA agents is discussed below. 

FORMULATION 

The iRNA agents described herein can be formulated for administration to a subject. 

For ease of exposition the formulations, compositions and methods in this section are 
discussed largely with regard to unmodified iRNA agents. It should be understood, however, 
that these formulations, compositions and methods can be practiced with other iRNA agents, 
e.g., modified iRNA agents, and such practice is within the invention. 

A formulated iRNA composition can assume a variety of states. In some examples, 
the composition is at least partially crystalline, uniformly crystalline, and/or anhydrous (e.g., 
less than 80, 50, 30, 20, or 10% water). In another example, the iRNA is in an aqueous 
phase, e.g., in a solution that includes water. 

The aqueous phase or the crystalline compositions can, e.g., be incorporated into a 
delivery vehicle, e.g., a liposome (particularly for the aqueous phase) or a particle (e.g., a 
microparticle as can be appropriate for a crystalline composition). Generally, the iRNA 
composition is formulated in a manner that is compatible with the intended method of 
administration (see, below). 

In particular embodiments, the composition is prepared by at least one of the 
following methods: spray drying, lyophilization, vacuum drying, evaporation, fluid bed 
drying, or a combination of these techniques; or sonication with a lipid, freeze-drying, 
condensation and other self-assembly. 

An iRNA preparation can be formulated in combination with another agent, e.g., 
another therapeutic agent or an agent that stabilizes an iRNA, e.g., a protein that complexes 
with iRNA to form an iRNP. Still other agents include chelators, e.g., EDTA (e.g., to 
remove divalent cations such as Mg2+), salts, RNAse inhibitors (e.g., a broad specificity 
RNAse inhibitor such as RNAsin) and so forth. 

In one embodiment, the iRNA preparation includes another iRNA agent, e.g., a 
second iRNA that can mediate RNAi with respect to a second gene, or with respect to the 
same gene. Still other preparations can include at least three, five, ten, twenty, fifty, or a 
hundred or more different iRNA species. Such iRNAs can mediate RNAi with respect to a 
similar number of different genes. 

In one embodiment, the iRNA preparation includes at least a second therapeutic agent 
(e.g., an agent other than an RNA or a DNA). For example, an iRNA composition for the 
treatment of a viral disease, e.g., HIV, might include a known antiviral agent (e.g., a protease 
inhibitor or reverse transcriptase inhibitor). In another example, an iRNA composition for the 
treatment of a cancer might further comprise a chemotherapeutic agent. 

Exemplary formulations are discussed below: 

Liposomes 

For ease of exposition the formulations, compositions and methods in this section are 
discussed largely with regard to unmodified iRNA agents. It should be understood, however, 
that these formulations, compositions and methods can be practiced with other iRNA agents, 
e.g., modified iRNA agents, and such practice is within the invention. 

An iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, 
e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA which 
encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor 
thereof) preparation can be formulated for delivery in a membranous molecular assembly, 
e.g., a liposome or a micelle. 
As used herein, the term "liposome" refers to a vesicle composed of amphiphilic lipids 
arranged in at least one bilayer, e.g., one bilayer or a plurality of bilayers. Liposomes include 
unilamellar and multilamellar vesicles that have a membrane formed from a lipophilic 
material and an aqueous interior. The aqueous portion contains the iRNA composition. The 
lipophilic material isolates the aqueous interior from an aqueous exterior, which typically 
does not include the iRNA composition, although in some examples, it may. Liposomes are 
useful for the transfer and delivery of active ingredients to the site of action. Because the 
liposomal membrane is structurally similar to biological membranes, when liposomes are 
applied to a tissue, the liposomal bilayer fuses with the bilayer of the cellular membranes. As the 
merging of the liposome and cell progresses, the internal aqueous contents that include the 
iRNA are delivered into the cell where the iRNA can specifically bind to a target RNA and 
can mediate RNAi. In some cases the liposomes are also specifically targeted, e.g., to direct 
the iRNA to particular cell types. 

A liposome containing an iRNA can be prepared by a variety of methods. 
In one example, the lipid component of a liposome is dissolved in a detergent so that 
micelles are formed with the lipid component. For example, the lipid component can be an 
amphipathic cationic lipid or lipid conjugate. The detergent can have a high critical micelle 
concentration and may be nonionic. Exemplary detergents include cholate, CHAPS, 
octylglucoside, deoxycholate, and lauroyl sarcosine. The iRNA preparation is then added to 
the micelles that include the lipid component. The cationic groups on the lipid interact with 
the iRNA and condense around the iRNA to form a liposome. After condensation, the 
detergent is removed, e.g., by dialysis, to yield a liposomal preparation of iRNA. 

If necessary a carrier compound that assists in condensation can be added during the 
condensation reaction, e.g., by controlled addition. For example, the carrier compound can 
be a polymer other than a nucleic acid (e.g., spermine or spermidine). The pH can also be adjusted 
to favor condensation. 

Further description of methods for producing stable polynucleotide delivery vehicles, 
which incorporate a polynucleotide/cationic lipid complex as structural components of the 
delivery vehicle, is provided in, e.g., WO 96/37194. Liposome formation can also include 
one or more aspects of exemplary methods described in Felgner, P. L. et al., Proc. Natl. 
Acad. Sci., USA 84:7413-7417, 1987; U.S. Pat. No. 4,897,355; U.S. Pat. No. 5,171,678; 
Bangham, et al. J. Mol. Biol. 23:238, 1965; Olson, et al. Biochim. Biophys. Acta 557:9, 
1979; Szoka, et al. Proc. Natl. Acad. Sci. 75:4194, 1978; Mayhew, et al. Biochim. Biophys. 
Acta 775:169, 1984; Kim, et al. Biochim. Biophys. Acta 728:339, 1983; and Fukunaga, et al. 
Endocrinol. 115:757, 1984. Commonly used techniques for preparing lipid aggregates of 
appropriate size for use as delivery vehicles include sonication and freeze-thaw plus 
extrusion (see, e.g., Mayer, et al. Biochim. Biophys. Acta 858:161, 1986). Microfluidization 
can be used when consistently small (50 to 200 nm) and relatively uniform aggregates are 
desired (Mayhew, et al. Biochim. Biophys. Acta 775:169, 1984). These methods are readily 
adapted to packaging iRNA preparations into liposomes. 

Liposomes that are pH-sensitive or negatively charged entrap nucleic acid molecules 
rather than complex with them. Since both the nucleic acid molecules and the lipid are 
similarly charged, repulsion rather than complex formation occurs. Nevertheless, some 
nucleic acid molecules are entrapped within the aqueous interior of these liposomes. 
pH-sensitive liposomes have been used to deliver DNA encoding the thymidine kinase gene to 
cell monolayers in culture. Expression of the exogenous gene was detected in the target cells 
(Zhou et al., Journal of Controlled Release, 19, (1992) 269-274). 

One major type of liposomal composition includes phospholipids other than 
naturally-derived phosphatidylcholine. Neutral liposome compositions, for example, can be 
formed from dimyristoyl phosphatidylcholine (DMPC) or dipalmitoyl phosphatidylcholine 
(DPPC). Anionic liposome compositions generally are formed from dimyristoyl 
phosphatidylglycerol, while anionic fusogenic liposomes are formed primarily from dioleoyl 
phosphatidylethanolamine (DOPE). Another type of liposomal composition is formed from 
phosphatidylcholine (PC) such as, for example, soybean PC and egg PC. Another type is 
formed from mixtures of phospholipid and/or phosphatidylcholine and/or cholesterol. 

Examples of other methods to introduce liposomes into cells in vitro and in vivo 
include U.S. Pat. No. 5,283,185; U.S. Pat. No. 5,171,678; WO 94/00569; WO 93/24640; WO 
91/16024; Felgner, J. Biol. Chem. 269:2550, 1994; Nabel, Proc. Natl. Acad. Sci. 90:11307, 
1993; Nabel, Human Gene Ther. 3:649, 1992; Gershon, Biochem. 32:7143, 1993; and Strauss 
EMBO J. 11:417, 1992. 

In one embodiment, cationic liposomes are used. Cationic liposomes possess the 
advantage of being able to fuse to the cell membrane. Non-cationic liposomes, although not 
able to fuse as efficiently with the plasma membrane, are taken up by macrophages in vivo 
and can be used to deliver iRNAs to macrophages. 

Further advantages of liposomes include: liposomes obtained from natural 
phospholipids are biocompatible and biodegradable; liposomes can incorporate a wide range 
of water- and lipid-soluble drugs; liposomes can protect encapsulated iRNAs in their internal 
compartments from metabolism and degradation (Rosoff, in "Pharmaceutical Dosage 
Forms," Lieberman, Rieger and Banker (Eds.), 1988, volume 1, p. 245). Important 
considerations in the preparation of liposome formulations are the lipid surface charge, 
vesicle size and the aqueous volume of the liposomes. 

A positively charged synthetic cationic lipid, 
N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium chloride (DOTMA), can be used to form 
small liposomes that interact spontaneously with nucleic acid to form lipid-nucleic acid 
complexes which are capable of fusing with the negatively charged lipids of the cell 
membranes of tissue culture cells, resulting in delivery of iRNA (see, e.g., Felgner, P. L. 
et al., Proc. Natl. Acad. Sci., USA 8:7413-7417, 1987 and U.S. Pat. No. 4,897,355 for a 
description of DOTMA and its use with DNA). 

A DOTMA analogue, 1,2-bis(oleoyloxy)-3-(trimethylammonia)propane (DOTAP), 
can be used in combination with a phospholipid to form DNA-complexing vesicles. 

Lipofectin™ (Bethesda Research Laboratories, Gaithersburg, Md.) is an effective agent for 
the delivery of highly anionic nucleic acids into living tissue culture cells. It comprises 
positively charged DOTMA liposomes which interact spontaneously with negatively charged 
polynucleotides to form complexes. When enough positively charged liposomes are used, 
the net charge on the resulting complexes is also positive. Positively charged complexes 
prepared in this way spontaneously attach to negatively charged cell surfaces, fuse with the 
plasma membrane, and efficiently deliver functional nucleic acids into, for example, tissue 
culture cells. Another commercially available cationic lipid, 
1,2-bis(oleoyloxy)-3,3-(trimethylammonia)propane ("DOTAP") (Boehringer Mannheim, 
Indianapolis, Indiana), differs from DOTMA in that the oleoyl moieties are linked by ester, 
rather than ether, linkages. 

Other reported cationic lipid compounds include those that have been conjugated to a 
variety of moieties including, for example, carboxyspermine which has been conjugated to 
one of two types of lipids and includes compounds such as 5-carboxyspermylglycine 
dioctaoleoylamide ("DOGS") (Transfectam™, Promega, Madison, Wisconsin) and 
dipalmitoylphosphatidylethanolamine 5-carboxyspermyl-amide ("DPPES") (see, e.g., U.S. 
Pat. No. 5,171,678). 

Another cationic lipid conjugate includes derivatization of the lipid with cholesterol 
("DC-Chol") which has been formulated into liposomes in combination with DOPE (see 
Gao, X. and Huang, L., Biochem. Biophys. Res. Commun. 179:280, 1991). Lipopolylysine, 
made by conjugating polylysine to DOPE, has been reported to be effective for transfection 
in the presence of serum (Zhou, X. et al., Biochim. Biophys. Acta 1065:8, 1991). For certain 
cell lines, these liposomes containing conjugated cationic lipids are said to exhibit lower 
toxicity and provide more efficient transfection than the DOTMA-containing compositions. 
Other commercially available cationic lipid products include DMRIE and DMRIE-HP 
(Vical, La Jolla, California) and Lipofectamine (DOSPA) (Life Technology, Inc., 
Gaithersburg, Maryland). Other cationic lipids suitable for the delivery of oligonucleotides 
are described in WO 98/39359 and WO 96/37194. 

Liposomal formulations are particularly suited for topical administration, where liposomes 
present several advantages over other formulations. Such advantages include reduced side 
effects related to high systemic absorption of the administered drug, increased accumulation 
of the administered drug at the desired target, and the ability to administer iRNA into the 
skin. In some implementations, liposomes are used for delivering iRNA to epidermal cells 
and also to enhance the penetration of iRNA into dermal tissues, e.g., into skin. For example, 
the liposomes can be applied topically. Topical delivery of drugs formulated as liposomes to 
the skin has been documented (see, e.g., Weiner et al., Journal of Drug Targeting, 1992, vol. 
2, 405-410 and du Plessis et al., Antiviral Research, 18, 1992, 259-265; Mannino, R. J. and 
Gould-Fogerite, S., Biotechniques 6:682-690, 1988; Itani, T. et al. Gene 56:267-276, 1987; 
Nicolau, C. et al. Meth. Enz. 149:157-176, 1987; Straubinger, R. M. and Papahadjopoulos, 
D. Meth. Enz. 101:512-527, 1983; Wang, C. Y. and Huang, L., Proc. Natl. Acad. Sci. USA 
84:7851-7855, 1987). 

Non-ionic liposomal systems have also been examined to determine their utility in the 
delivery of drugs to the skin, in particular systems comprising non-ionic surfactant and 
cholesterol. Non-ionic liposomal formulations comprising Novasome I (glyceryl 
dilaurate/cholesterol/polyoxyethylene-10-stearyl ether) and Novasome II (glyceryl distearate/ 
cholesterol/polyoxyethylene-10-stearyl ether) were used to deliver a drug into the dermis of 
mouse skin. Such formulations with iRNA are useful for treating a dermatological disorder. 

Liposomes that include iRNA can be made highly deformable. Such deformability 
can enable the liposomes to penetrate through pores that are smaller than the average radius 
of the liposome. For example, transfersomes are a type of deformable liposome. 
Transfersomes can be made by adding surface edge activators, usually surfactants, to a 
standard liposomal composition. Transfersomes that include iRNA can be delivered, for 
example, subcutaneously by injection in order to deliver iRNA to keratinocytes in the skin. 
In order to cross intact mammalian skin, lipid vesicles must pass through a series of fine 
pores, each with a diameter less than 50 nm, under the influence of a suitable transdermal 
gradient. In addition, due to the lipid properties, these transfersomes can be self-optimizing 
(adaptive to the shape of pores, e.g., in the skin), self-repairing, and often self-loading, 
and can frequently reach their targets without fragmenting. The iRNA agents can include an 
RRMS tethered to a moiety which improves association with a liposome. 

Surfactants 

For ease of exposition the formulations, compositions and methods in this section are 
discussed largely with regard to unmodified iRNA agents. It should be understood, however, 
that these formulations, compositions and methods can be practiced with other iRNA agents, 
e.g., modified iRNA agents, and such practice is within the invention. Surfactants find wide 
application in formulations such as emulsions (including microemulsions) and liposomes (see 
above). iRNA (or a precursor, e.g., a larger dsRNA which can be processed into an iRNA, or 
a DNA which encodes an iRNA or precursor) compositions can include a surfactant. In one 
embodiment, the iRNA is formulated as an emulsion that includes a surfactant. The most 
common way of classifying and ranking the properties of the many different types of 
surfactants, both natural and synthetic, is by the use of the hydrophile/lipophile balance 
(HLB). The nature of the hydrophilic group provides the most useful means for categorizing 
the different surfactants used in formulations (Rieger, in "Pharmaceutical Dosage Forms," 
Marcel Dekker, Inc., New York, NY, 1988, p. 285). 

If the surfactant molecule is not ionized, it is classified as a nonionic surfactant. 
Nonionic surfactants find wide application in pharmaceutical products and are usable over a 
wide range of pH values. In general their HLB values range from 2 to about 18 depending 
on their structure. Nonionic surfactants include nonionic esters such as ethylene glycol 
esters, propylene glycol esters, glyceryl esters, polyglyceryl esters, sorbitan esters, sucrose 
esters, and ethoxylated esters. Nonionic alkanolamides and ethers such as fatty alcohol 
ethoxylates, propoxylated alcohols, and ethoxylated/propoxylated block polymers are also 
included in this class. The polyoxyethylene surfactants are the most popular members of the 
nonionic surfactant class. 

If the surfactant molecule carries a negative charge when it is dissolved or dispersed 
in water, the surfactant is classified as anionic. Anionic surfactants include carboxylates 
such as soaps, acyl lactylates, acyl amides of amino acids, esters of sulfuric acid such as alkyl 
sulfates and ethoxylated alkyl sulfates, sulfonates such as alkyl benzene sulfonates, acyl 
isethionates, acyl taurates and sulfosuccinates, and phosphates. The most important members 
of the anionic surfactant class are the alkyl sulfates and the soaps. 

If the surfactant molecule carries a positive charge when it is dissolved or dispersed in 
water, the surfactant is classified as cationic. Cationic surfactants include quaternary 
ammonium salts and ethoxylated amines. The quaternary ammonium salts are the most used 
members of this class. 

If the surfactant molecule has the ability to carry either a positive or negative charge, 
the surfactant is classified as amphoteric. Amphoteric surfactants include acrylic acid 
derivatives, substituted alkylamides, N-alkylbetaines and phosphatides. 

The use of surfactants in drug products, formulations and in emulsions has been 
reviewed (Rieger, in "Pharmaceutical Dosage Forms," Marcel Dekker, Inc., New York, NY, 
1988, p. 285). 
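
The charge-based classification described in this section can be summarized as a 
small decision rule. The following Python sketch is illustrative only: the names 
SurfactantClass and classify_surfactant, and the two boolean inputs, are hypothetical 
and are introduced here solely to restate the four classes named above. 

```python
from enum import Enum


class SurfactantClass(Enum):
    """The four classes named in the text (hypothetical enum for illustration)."""
    NONIONIC = "nonionic"
    ANIONIC = "anionic"
    CATIONIC = "cationic"
    AMPHOTERIC = "amphoteric"


def classify_surfactant(can_be_positive: bool, can_be_negative: bool) -> SurfactantClass:
    """Classify a surfactant by the charge it can carry when dissolved or dispersed in water."""
    if can_be_positive and can_be_negative:
        # Able to carry either charge: amphoteric (e.g., N-alkylbetaines).
        return SurfactantClass.AMPHOTERIC
    if can_be_positive:
        # Positive charge only: cationic (e.g., quaternary ammonium salts).
        return SurfactantClass.CATIONIC
    if can_be_negative:
        # Negative charge only: anionic (e.g., alkyl sulfates, soaps).
        return SurfactantClass.ANIONIC
    # Not ionized: nonionic (e.g., sorbitan esters).
    return SurfactantClass.NONIONIC
```

For instance, an alkyl sulfate would be passed as classify_surfactant(False, True). 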

Micelles and other Membranous Formulations 

For ease of exposition the micelles and other formulations, compositions and methods 
in this section are discussed largely with regard to unmodified iRNA agents. It should be 
understood, however, that these micelles and other formulations, compositions and methods 
can be practiced with other iRNA agents, e.g., modified iRNA agents, and such practice is 
within the invention. The iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a 
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof) composition can be provided as a micellar formulation. "Micelles" are 
defined herein as a particular type of molecular assembly in which amphipathic molecules 
are arranged in a spherical structure such that all the hydrophobic portions of the molecules 
are directed inward, leaving the hydrophilic portions in contact with the surrounding aqueous 
phase. The converse arrangement exists if the environment is hydrophobic. 

A mixed micellar formulation suitable for delivery through transdermal membranes 
may be prepared by mixing an aqueous solution of the iRNA composition, an alkali metal C8 
to C22 alkyl sulphate, and micelle forming compounds. Exemplary micelle forming 
compounds include lecithin, hyaluronic acid, pharmaceutically acceptable salts of hyaluronic 
acid, glycolic acid, lactic acid, chamomile extract, cucumber extract, oleic acid, linoleic acid, 
linolenic acid, monoolein, monooleates, monolaurates, borage oil, evening primrose oil, 
menthol, trihydroxy oxo cholanyl glycine and pharmaceutically acceptable salts thereof, 
glycerin, polyglycerin, lysine, polylysine, triolein, polyoxyethylene ethers and analogues 
thereof, polidocanol alkyl ethers and analogues thereof, chenodeoxycholate, deoxycholate, 
and mixtures thereof. The micelle forming compounds may be added at the same time as or 
after addition of the alkali metal alkyl sulphate. Mixed micelles will form with substantially 
any kind of mixing of the ingredients, but vigorous mixing is preferred in order to provide 
smaller size micelles. 

In one method, a first micellar composition is prepared which contains the iRNA 
composition and at least the alkali metal alkyl sulphate. The first micellar composition is then 
mixed with at least three micelle forming compounds to form a mixed micellar composition. 
In another method, the micellar composition is prepared by mixing the iRNA composition, 
the alkali metal alkyl sulphate and at least one of the micelle forming compounds, followed 
by addition of the remaining micelle forming compounds, with vigorous mixing. 

Phenol and/or m-cresol may be added to the mixed micellar composition to stabilize 
the formulation and protect against bacterial growth. Alternatively, phenol and/or m-cresol 
may be added with the micelle forming ingredients. An isotonic agent such as glycerin may 
also be added after formation of the mixed micellar composition. 

For delivery of the micellar formulation as a spray, the formulation can be put into an 
aerosol dispenser and the dispenser charged with a propellant. The propellant, which is 
under pressure, is in liquid form in the dispenser. The ratios of the ingredients are adjusted 
so that the aqueous and propellant phases become one, i.e., there is one phase. If there are 
two phases, it is necessary to shake the dispenser prior to dispensing a portion of the 
contents, e.g., through a metered valve. The dispensed dose of pharmaceutical agent is 
propelled from the metered valve in a fine spray. 

The preferred propellants are hydrogen-containing chlorofluorocarbons, 
hydrogen-containing fluorocarbons, dimethyl ether and diethyl ether. Even more preferred is 
HFA 134a (1,1,1,2-tetrafluoroethane). 

The specific concentrations of the essential ingredients can be determined by 
relatively straightforward experimentation. For absorption through the oral cavities, it is 
often desirable to increase, e.g., at least double or triple, the dosage relative to that used 
for injection or administration through the gastrointestinal tract. 

The iRNA agents can include an RRMS tethered to a moiety which improves 
association with a micelle or other membranous formulation. 

Particles 

For ease of exposition the particles, formulations, compositions and methods in this 
section are discussed largely with regard to unmodified iRNA agents. It should be 
understood, however, that these particles, formulations, compositions and methods can be 
practiced with other iRNA agents, e.g., modified iRNA agents, and such practice is within 
the invention. In another embodiment, an iRNA agent, e.g., a double-stranded iRNA agent, 
or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed into a 
sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, 
or sRNA agent, or precursor thereof) preparation may be incorporated into a particle, e.g., a 
microparticle. Microparticles can be produced by spray-drying, but may also be produced by 
other methods including lyophilization, evaporation, fluid bed drying, vacuum drying, or a 
combination of these techniques. See below for further description. 

Sustained-Release Formulations. An iRNA agent, e.g., a double-stranded iRNA 
agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed 
into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA 
agent, or sRNA agent, or precursor thereof) described herein can be formulated for 
controlled, e.g., slow release. Controlled release can be achieved by disposing the iRNA 
within a structure or substance which impedes its release. For example, iRNA can be disposed 
within a porous matrix or in an erodable matrix, either of which allows release of the iRNA 
over a period of time. 

Polymeric particles, e.g., polymeric microparticles, can be used as a sustained-release 
reservoir of iRNA that is taken up by cells only after it is released from the microparticle 
through biodegradation. The polymeric particles in this embodiment should therefore be 
large enough to preclude phagocytosis (e.g., larger than 10 µm and preferably larger than 
20 µm). Such particles can be produced by the same methods used to make smaller particles, 
but with less vigorous mixing of the first and second emulsions. That is to say, a lower 
homogenization speed, vortex mixing speed, or sonication setting can be used to obtain 
particles having a diameter around 100 µm rather than 10 µm. The time of mixing also can be 
altered. 

Larger microparticles can be formulated as a suspension, a powder, or an implantable 
solid, to be delivered by intramuscular, subcutaneous, intradermal, intravenous, or 
intraperitoneal injection; via inhalation (intranasal or intrapulmonary); orally; or by 
implantation. These particles are useful for delivery of any iRNA when slow release over a 
relatively long term is desired. The rate of degradation, and consequently of release, varies 
with the polymeric formulation. 

Microparticles preferably include pores, voids, hollows, defects or other interstitial 
spaces that allow the fluid suspension medium to freely permeate or perfuse the particulate 
boundary. For example, the perforated microstructures can be used to form hollow, porous 
spray dried microspheres. 

Polymeric particles containing iRNA (e.g., a sRNA) can be made using a double 
emulsion technique, for instance. First, the polymer is dissolved in an organic solvent. A 
preferred polymer is polylactic-co-glycolic acid (PLGA), with a lactic/glycolic acid weight 
ratio of 65:35, 50:50, or 75:25. Next, a sample of nucleic acid suspended in aqueous solution 
is added to the polymer solution and the two solutions are mixed to form a first emulsion. 
The solutions can be mixed by vortexing or shaking, and in a preferred method, the mixture 
can be sonicated. Most preferable is any method by which the nucleic acid receives the least 
amount of damage in the form of nicking, shearing, or degradation, while still allowing the 
formation of an appropriate emulsion. For example, acceptable results can be obtained with a 
Vibra-cell model VC-250 sonicator with a 1/8" microtip probe, at setting #3. 

Spray-Drying. An iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a 
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof) can be prepared by spray drying. Spray dried iRNA can be administered 
to a subject or be subjected to further formulation. A pharmaceutical composition of iRNA 
can be prepared by spray drying a homogeneous aqueous mixture that includes an iRNA under 
conditions sufficient to provide a dispersible powdered composition, e.g., a pharmaceutical 
composition. The material for spray drying can also include one or more of: a 
pharmaceutically acceptable excipient, or a dispersibility-enhancing amount of a 
physiologically acceptable, water-soluble protein. The spray-dried product can be a 
dispersible powder that includes the iRNA. 

Spray drying is a process that converts a liquid or slurry material to a dried particulate 
form. Spray drying can be used to provide powdered material for various administrative 
routes including inhalation. See, for example, M. Sacchetti and M. M. Van Oort in: 
Inhalation Aerosols: Physical and Biological Basis for Therapy, A. J. Hickey, ed. Marcel 
Dekker, New York, 1996. 

Spray drying can include atomizing a solution, emulsion, or suspension to form a fine 
mist of droplets and drying the droplets. The mist can be projected into a drying chamber 
(e.g., a vessel, tank, tubing, or coil) where it contacts a drying gas. The mist can include 
solid or liquid pore forming agents. The solvent and pore forming agents evaporate from the 
droplets into the drying gas to solidify the droplets, simultaneously forming pores throughout 
the solid. The solid (typically in a powder, particulate form) then is separated from the drying 
gas and collected. 

Spray drying includes bringing together a highly dispersed liquid and a sufficient 
volume of air (e.g., hot air) to produce evaporation and drying of the liquid droplets. The 
preparation to be spray dried can be any solution, coarse suspension, slurry, colloidal 
dispersion, or paste that may be atomized using the selected spray drying apparatus. 
Typically, the feed is sprayed into a current of warm filtered air that evaporates the solvent 
and conveys the dried product to a collector. The spent air is then exhausted with the solvent. 
Several different types of apparatus may be used to provide the desired product. For example, 
commercial spray dryers manufactured by Buchi Ltd. or Niro Corp. can effectively produce 
particles of desired size. 

Spray-dried powdered particles can be approximately spherical in shape, nearly 
uniform in size and frequently hollow. There may be some degree of irregularity in shape 
depending upon the incorporated medicament and the spray drying conditions. In many 
instances the dispersion stability of spray-dried microspheres appears to be greater if 
an inflating agent (or blowing agent) is used in their production. Particularly preferred 
embodiments may comprise an emulsion with an inflating agent as the disperse or continuous 
phase (the other phase being aqueous in nature). An inflating agent is preferably dispersed 
with a surfactant solution, using, for instance, a commercially available microfluidizer at a 
pressure of about 5000 to 15,000 psi. This process forms an emulsion, preferably stabilized 
by an incorporated surfactant, typically comprising submicron droplets of water immiscible 
blowing agent dispersed in an aqueous continuous phase. The formation of such dispersions 
using this and other techniques is common and well known to those in the art. The blowing 
agent is preferably a fluorinated compound (e.g., perfluorohexane, perfluorooctyl bromide, 
perfluorodecalin, perfluorobutyl ethane) which vaporizes during the spray-drying process, 
leaving behind generally hollow, porous, aerodynamically light microspheres. As will be 
discussed in more detail below, other suitable blowing agents include chloroform, freons, and 
hydrocarbons. Nitrogen gas and carbon dioxide are also contemplated as suitable blowing 
agents. 

Although the perforated microstructures are preferably formed using a blowing agent 
as described above, it will be appreciated that, in some instances, no blowing agent is 
required and an aqueous dispersion of the medicament and surfactant(s) is spray dried 
directly. In such cases, the formulation may be amenable to process conditions (e.g., elevated 
temperatures) that generally lead to the formation of hollow, relatively porous microparticles. 
Moreover, the medicament may possess special physicochemical properties (e.g., high 
crystallinity, elevated melting temperature, surface activity, etc.) that make it particularly 
suitable for use in such techniques. 

The perforated microstructures may optionally be associated with, or comprise, one 
or more surfactants. Moreover, miscible surfactants may optionally be combined with the 
suspension medium liquid phase. It will be appreciated by those skilled in the art that the use 
of surfactants may further increase dispersion stability, simplify formulation procedures or 
increase bioavailability upon administration. Of course, combinations of surfactants, 
including the use of one or more in the liquid phase and one or more associated with the 
perforated microstructures, are contemplated as being within the scope of the invention. By 
"associated with or comprise" it is meant that the structural matrix or perforated 
microstructure may incorporate, adsorb, absorb, be coated with or be formed by the 
surfactant. 

Surfactants suitable for use include any compound or composition that aids in the 
formation and maintenance of the stabilized respiratory dispersions by forming a layer at the 
interface between the structural matrix and the suspension medium. The surfactant may 
comprise a single compound or any combination of compounds, such as in the case of co- 
surfactants. Particularly preferred surfactants are substantially insoluble in the propellant, 
nonfluorinated, and selected from the group consisting of saturated and unsaturated lipids, 
nonionic detergents, nonionic block copolymers, ionic surfactants, and combinations of such 
agents. It should be emphasized that, in addition to the aforementioned surfactants, suitable 
(i.e. biocompatible) fluorinated surfactants are compatible with the teachings herein and may 
be used to provide the desired stabilized preparations. 

Lipids, including phospholipids, from both natural and synthetic sources may be used 
in varying concentrations to form a structural matrix. Generally, compatible lipids comprise 
those that have a gel to liquid crystal phase transition greater than about 40°C. Preferably, 
the incorporated lipids are relatively long chain (i.e., C6-C22) saturated lipids and more 
preferably comprise phospholipids. Exemplary phospholipids useful in the disclosed 
stabilized preparations comprise egg phosphatidylcholine, dilauroylphosphatidylcholine, 
dioleylphosphatidylcholine, dipalmitoylphosphatidylcholine, distearoylphosphatidylcholine, 
short-chain phosphatidylcholines, phosphatidylethanolamine, 
dioleylphosphatidylethanolamine, phosphatidylserine, phosphatidylglycerol, 
phosphatidylinositol, glycolipids, ganglioside GM1, sphingomyelin, phosphatidic acid, 
cardiolipin; lipids bearing polymer chains such as polyethylene glycol, chitin, hyaluronic 
acid, or polyvinylpyrrolidone; lipids bearing sulfonated mono-, di-, and polysaccharides; 
fatty acids such as palmitic acid, stearic acid, and oleic acid; cholesterol, cholesterol esters, 
and cholesterol hemisuccinate. Due to their excellent biocompatibility characteristics, 
phospholipids and combinations of phospholipids and poloxamers are particularly suitable 
for use in the stabilized dispersions disclosed herein. 

Compatible nonionic detergents comprise: sorbitan esters including sorbitan trioleate 
(Span™ 85), sorbitan sesquioleate, sorbitan monooleate, sorbitan monolaurate, 
polyoxyethylene (20) sorbitan monolaurate, and polyoxyethylene (20) sorbitan monooleate, 
oleyl polyoxyethylene (2) ether, stearyl polyoxyethylene (2) ether, lauryl polyoxyethylene (4) 
ether, glycerol esters, and sucrose esters. Other suitable nonionic detergents can be easily 
identified using McCutcheon's Emulsifiers and Detergents (McPublishing Co., Glen Rock, 
N.J.). Preferred block copolymers include diblock and triblock copolymers of 
polyoxyethylene and polyoxypropylene, including poloxamer 188 (Pluronic® F-68), 
poloxamer 407 (Pluronic® F-127), and poloxamer 338. Ionic surfactants such as sodium 
sulfosuccinate and fatty acid soaps may also be utilized. In preferred embodiments, the 
microstructures may comprise oleic acid or its alkali salt. 

In addition to the aforementioned surfactants, cationic surfactants or lipids are 
preferred, especially in the case of delivery of an iRNA agent, e.g., a double-stranded iRNA 
agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed 
into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA 
agent, or sRNA agent, or precursor thereof). Examples of suitable cationic lipids include: 
DOTMA, N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium chloride; DOTAP, 
1,2-dioleyloxy-3-(trimethylammonio)propane; and DOTB, 
1,2-dioleyl-3-(4'-trimethylammonio)butanoyl-sn-glycerol. Polycationic amino acids such as 
polylysine and polyarginine are also contemplated. 

For the spraying process, such spraying methods as rotary atomization, pressure 
atomization and two-fluid atomization can be used. Examples of devices that can be used for 
spraying with a rotary-disk atomizer include "Parubisu [phonetic rendering] Mini-Spray 
GA-32" and "Parubisu Spray Drier DL-41," manufactured by Yamato Chemical Co., and 
"Spray Drier CL-8," "Spray Drier L-8," "Spray Drier FL-12," "Spray Drier FL-16" and 
"Spray Drier FL-20," manufactured by Okawara Kakoki Co. 

While no particular restrictions are placed on the gas used to dry the sprayed material, 
it is recommended to use air, nitrogen gas or an inert gas. The inlet temperature of the 
gas used to dry the sprayed material is such that it does not cause heat deactivation of the 
sprayed material. The inlet temperature may vary between about 50°C and about 200°C, 
preferably between about 50°C and 100°C. The temperature of the outlet gas used to dry the 
sprayed material may vary between about 0°C and about 150°C, preferably between 0°C and 
90°C, and even more preferably between 0°C and 60°C. 
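
The nested temperature ranges above can be restated as a simple check. The following 
Python sketch is illustrative only: the names TempRange and grade_temperature are 
hypothetical, and treating each "about" bound as an inclusive limit is an assumption 
made for the example rather than a requirement of the method. 

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TempRange:
    """Closed temperature interval in degrees Celsius (bounds assumed inclusive)."""
    low_c: float
    high_c: float

    def contains(self, temp_c: float) -> bool:
        return self.low_c <= temp_c <= self.high_c


# Ranges taken from the text, ordered from most to least preferred.
INLET_RANGES = [
    ("preferred", TempRange(50.0, 100.0)),
    ("acceptable", TempRange(50.0, 200.0)),
]
OUTLET_RANGES = [
    ("more preferred", TempRange(0.0, 60.0)),
    ("preferred", TempRange(0.0, 90.0)),
    ("acceptable", TempRange(0.0, 150.0)),
]


def grade_temperature(temp_c: float, ranges) -> str:
    """Return the label of the narrowest listed range containing temp_c."""
    for label, temp_range in ranges:
        if temp_range.contains(temp_c):
            return label
    return "out of range"
```

For example, grade_temperature(120.0, INLET_RANGES) falls in the broad inlet range 
but outside the preferred one. 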

The spray drying is done under conditions that result in a substantially amorphous 
powder of homogeneous constitution having a particle size that is respirable, a low moisture 
content and flow characteristics that allow for ready aerosolization. Preferably the particle 
size of the resulting powder is such that more than about 98% of the mass is in particles 
having a diameter of about 10 µm or less, with about 90% of the mass being in particles 
having a diameter less than 5 µm. Alternatively, about 95% of the mass will be in particles 
with a diameter of less than 10 µm, with about 80% of the mass being in particles having a 
diameter of less than 5 µm. 
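
The mass-based size criteria above can be checked against a measured distribution as 
follows. This Python sketch is illustrative only: the function names, the 
(diameter, mass) input format, and the treatment of "about" as an exact threshold are 
assumptions made for the example. 

```python
def mass_fraction_below(particles, cutoff_um, inclusive=False):
    """Fraction of total mass in particles below (or at/below) cutoff_um.

    particles: iterable of (diameter_um, mass) pairs; mass may be in any unit.
    """
    particles = list(particles)
    total_mass = sum(mass for _, mass in particles)
    if total_mass <= 0:
        raise ValueError("total mass must be positive")
    if inclusive:
        selected = sum(mass for d, mass in particles if d <= cutoff_um)
    else:
        selected = sum(mass for d, mass in particles if d < cutoff_um)
    return selected / total_mass


def meets_size_profile(particles, min_frac_10um, min_frac_5um):
    """Check the two mass-fraction thresholds at 10 um and 5 um."""
    particles = list(particles)
    frac_10 = mass_fraction_below(particles, 10.0, inclusive=True)
    frac_5 = mass_fraction_below(particles, 5.0)
    return frac_10 >= min_frac_10um and frac_5 >= min_frac_5um


# Hypothetical binned measurement: (diameter in um, mass in mg).
sample = [(2.0, 90.0), (7.0, 8.5), (12.0, 1.5)]
preferred_ok = meets_size_profile(sample, 0.98, 0.90)    # 98% / 90% profile
alternative_ok = meets_size_profile(sample, 0.95, 0.80)  # 95% / 80% profile
```

The 10 µm cutoff is treated as inclusive in both profiles for simplicity, although 
the alternative profile is stated as "less than 10 µm." 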

The dispersible pharmaceutical-based dry powders that include the iRNA preparation 
may optionally be combined with pharmaceutical carriers or excipients which are suitable for 
respiratory and pulmonary administration. Such carriers may serve simply as bulking agents 
when it is desired to reduce the iRNA concentration in the powder which is being delivered 
to a patient, but may also serve to enhance the stability of the iRNA compositions and to 
improve the dispersibility of the powder within a powder dispersion device in order to 
provide more efficient and reproducible delivery of the iRNA and to improve handling 
characteristics of the iRNA such as flowability and consistency to facilitate manufacturing 
and powder filling. 

Such carrier materials may be combined with the drug prior to spray drying, i.e., by
adding the carrier material to the purified bulk solution. In that way, the carrier particles will
be formed simultaneously with the drug particles to produce a homogeneous powder.
Alternatively, the carriers may be separately prepared in a dry powder form and combined
with the dry powder drug by blending. The powder carriers will usually be crystalline (to
avoid water absorption), but might in some cases be amorphous or mixtures of crystalline
and amorphous. The size of the carrier particles may be selected to improve the flowability of
the drug powder, typically being in the range from 25 µm to 100 µm. A preferred carrier
material is crystalline lactose having a size in the above-stated range.

Powders prepared by any of the above methods will be collected from the spray dryer 
in a conventional manner for subsequent use. For use as pharmaceuticals and other purposes, 
it will frequently be desirable to disrupt any agglomerates which may have formed by 
screening or other conventional techniques. For pharmaceutical uses, the dry powder 
formulations will usually be measured into a single dose, and the single dose sealed into a 
package. Such packages are particularly useful for dispersion in dry powder inhalers, as 
described in detail below. Alternatively, the powders may be packaged in multiple-dose 
containers. 

Methods for spray drying hydrophobic and other drugs and components are described 
in U.S. Pat. Nos. 5,000,888; 5,026,550; 4,670,419; 4,540,602; and 4,486,435. Bloch and
Speiser (1983) Pharm. Acta Helv 58:14-22 teach spray drying of hydrochlorothiazide and
chlorthalidone (lipophilic drugs) and a hydrophilic adjuvant (pentaerythritol) in azeotropic 
solvents of dioxane-water and 2-ethoxyethanol-water. A number of Japanese Patent 
application Abstracts relate to spray drying of hydrophilic-hydrophobic product 
combinations, including JP 806766; JP 7242568; JP 7101884; JP 7101883; JP 71018982; JP 
7101881; and JP 4036233. Other foreign patent publications relevant to spray drying 
hydrophilic-hydrophobic product combinations include FR 2594693; DE 2209477; and 
WO 88/07870. 

LYOPHILIZATION

An iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA
which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof) preparation can be made by lyophilization. Lyophilization is a
freeze-drying process in which water is sublimed from the composition after it is frozen. The
particular advantage associated with the lyophilization process is that biologicals and
pharmaceuticals that are relatively unstable in an aqueous solution can be dried without
elevated temperatures (thereby eliminating the adverse thermal effects), and then stored in a
dry state where there are few stability problems. With respect to the instant invention such
techniques are particularly compatible with the incorporation of nucleic acids in perforated
microstructures without compromising physiological activity. Methods for providing
lyophilized particulates are known to those of skill in the art and it would clearly not require
undue experimentation to provide dispersion compatible microstructures in accordance with
the teachings herein. Accordingly, to the extent that lyophilization processes may be used to
provide microstructures having the desired porosity and size, they are in conformance with the
teachings herein and are expressly contemplated as being within the scope of the instant
invention.

Targeting 

For ease of exposition the formulations, compositions and methods in this section are 
discussed largely with regard to unmodified iRNAs. It should be understood, however, that 
these formulations, compositions and methods can be practiced with other iRNA agents, e.g.,
modified iRNA agents, and such practice is within the invention.

In some embodiments, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA 
agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA 
agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or 
sRNA agent, or precursor thereof) is targeted to a particular cell. For example, a liposome
or particle or other structure that includes an iRNA can also include a targeting moiety that
recognizes a specific molecule on a target cell. The targeting moiety can be a molecule with
a specific affinity for a target cell. Targeting moieties can include antibodies directed against
a protein found on the surface of a target cell, or the ligand or a receptor-binding portion of a
ligand for a molecule found on the surface of a target cell. For example, the targeting moiety
can recognize a cancer-specific antigen (e.g., CA15-3, CA19-9, CEA, or HER2/neu) or a
viral antigen, thus delivering the iRNA to a cancer cell or a virus-infected cell. Exemplary 
targeting moieties include antibodies (such as IgM, IgG, IgA, IgD, and the like, or
functional portions thereof) and ligands for cell surface receptors (e.g., ectodomains thereof).
Table 3 provides a number of antigens which can be used to target selected cells.

Table 3.

ANTIGEN                                     Exemplary tumor tissue
CEA (carcinoembryonic antigen)              colon, breast, lung
PSA (prostate specific antigen)             prostate cancer
CA-125                                      ovarian cancer
CA15-3                                      breast cancer
CA 19-9                                     breast cancer
HER2/neu                                    breast cancer
α-feto protein                              testicular cancer, hepatic cancer
β-HCG (human chorionic gonadotropin)        testicular cancer, choriocarcinoma
MUC-1                                       breast cancer
Estrogen receptor                           breast cancer, uterine cancer
Progesterone receptor                       breast cancer, uterine cancer
EGFr (epidermal growth factor receptor)     bladder cancer

In one embodiment, the targeting moiety is attached to a liposome. For example, US
6,245,427 describes a method for targeting a liposome using a protein or peptide. In another
example, a cationic lipid component of the liposome is derivatized with a targeting moiety.
For example, WO 96/37194 describes converting N-glutaryldioleoylphosphatidyl
ethanolamine to a N-hydroxysuccinimide activated ester. The product was then coupled to
an RGD peptide.

GENES AND DISEASES

In one aspect, the invention features, a method of treating a subject at risk for or
afflicted with unwanted cell proliferation, e.g., malignant or nonmalignant cell proliferation.
The method includes:

providing an iRNA agent, e.g., an sRNA or iRNA agent described herein, e.g., an
iRNA having a structure described herein, where the iRNA is homologous to and can silence,
e.g., by cleavage, a gene which promotes unwanted cell proliferation;

administering an iRNA agent, e.g., an sRNA or iRNA agent described herein to a
subject, preferably a human subject,

thereby treating the subject.

In a preferred embodiment the gene is a growth factor or growth factor receptor gene,
a kinase, e.g., a protein tyrosine, serine or threonine kinase gene, an adaptor protein gene, a 
gene encoding a G protein superfamily molecule, or a gene encoding a transcription factor. 

In a preferred embodiment the iRNA agent silences the PDGF beta gene, and thus can 
be used to treat a subject having or at risk for a disorder characterized by unwanted PDGF 
beta expression, e.g., testicular and lung cancers. 

In another preferred embodiment the iRNA agent silences the Erb-B gene, and thus 
can be used to treat a subject having or at risk for a disorder characterized by unwanted Erb- 
B expression, e.g., breast cancer. 

In a preferred embodiment the iRNA agent silences the Src gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted Src 
expression, e.g., colon cancers. 

In a preferred embodiment the iRNA agent silences the CRK gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted CRK 
expression, e.g., colon and lung cancers. 

In a preferred embodiment the iRNA agent silences the GRB2 gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted GRB2 
expression, e.g., squamous cell carcinoma. 

In another preferred embodiment the iRNA agent silences the RAS gene, and thus can 
be used to treat a subject having or at risk for a disorder characterized by unwanted RAS 
expression, e.g., pancreatic, colon and lung cancers, and chronic leukemia. 

In another preferred embodiment the iRNA agent silences the MEKK gene, and thus 
can be used to treat a subject having or at risk for a disorder characterized by unwanted 
MEKK expression, e.g., squamous cell carcinoma, melanoma or leukemia. 

In another preferred embodiment the iRNA agent silences the JNK gene, and thus can 
be used to treat a subject having or at risk for a disorder characterized by unwanted JNK 
expression, e.g., pancreatic or breast cancers. 

In a preferred embodiment the iRNA agent silences the RAF gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted RAF 
expression, e.g., lung cancer or leukemia.

In a preferred embodiment the iRNA agent silences the Erk1/2 gene, and thus can be
used to treat a subject having or at risk for a disorder characterized by unwanted Erk1/2
expression, e.g., lung cancer.

In another preferred embodiment the iRNA agent silences the PCNA(p21) gene, and 
thus can be used to treat a subject having or at risk for a disorder characterized by unwanted 
PCNA expression, e.g., lung cancer. 

In a preferred embodiment the iRNA agent silences the MYB gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted MYB 
expression, e.g., colon cancer or chronic myelogenous leukemia. 

In a preferred embodiment the iRNA agent silences the c-MYC gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted c-MYC 
expression, e.g., Burkitt's lymphoma or neuroblastoma. 

In another preferred embodiment the iRNA agent silences the JUN gene, and thus can 
be used to treat a subject having or at risk for a disorder characterized by unwanted JUN 
expression, e.g., ovarian, prostate or breast cancers. 

In another preferred embodiment the iRNA agent silences the FOS gene, and thus can 
be used to treat a subject having or at risk for a disorder characterized by unwanted FOS 
expression, e.g., skin or prostate cancers. 

In a preferred embodiment the iRNA agent silences the BCL-2 gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted BCL-2 
expression, e.g., lung or prostate cancers or Non-Hodgkin lymphoma. 

In a preferred embodiment the iRNA agent silences the Cyclin D gene, and thus can 
be used to treat a subject having or at risk for a disorder characterized by unwanted Cyclin D 
expression, e.g., esophageal and colon cancers. 

In a preferred embodiment the iRNA agent silences the VEGF gene, and thus can be
used to treat a subject having or at risk for a disorder characterized by unwanted VEGF 
expression, e.g., esophageal and colon cancers. 

In a preferred embodiment the iRNA agent silences the EGFR gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted EGFR 
expression, e.g., breast cancer.

In another preferred embodiment the iRNA agent silences the Cyclin A gene, and
thus can be used to treat a subject having or at risk for a disorder characterized by unwanted 
Cyclin A expression, e.g., lung and cervical cancers. 

In another preferred embodiment the iRNA agent silences the Cyclin E gene, and thus 
can be used to treat a subject having or at risk for a disorder characterized by unwanted 
Cyclin E expression, e.g., lung and breast cancers. 

In another preferred embodiment the iRNA agent silences the WNT-1 gene, and thus 
can be used to treat a subject having or at risk for a disorder characterized by unwanted 
WNT-1 expression, e.g., basal cell carcinoma. 

In another preferred embodiment the iRNA agent silences the beta-catenin gene, and 
thus can be used to treat a subject having or at risk for a disorder characterized by unwanted 
beta-catenin expression, e.g., adenocarcinoma or hepatocellular carcinoma. 

In another preferred embodiment the iRNA agent silences the c-MET gene, and thus 
can be used to treat a subject having or at risk for a disorder characterized by unwanted c- 
MET expression, e.g., hepatocellular carcinoma. 

In another preferred embodiment the iRNA agent silences the PKC gene, and thus can 
be used to treat a subject having or at risk for a disorder characterized by unwanted PKC 
expression, e.g., breast cancer. 

In a preferred embodiment the iRNA agent silences the NFKB gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted NFKB 
expression, e.g., breast cancer. 

In a preferred embodiment the iRNA agent silences the STAT3 gene, and thus can be 
used to treat a subject having or at risk for a disorder characterized by unwanted STAT3 
expression, e.g., prostate cancer. 

In another preferred embodiment the iRNA agent silences the survivin gene, and thus 
can be used to treat a subject having or at risk for a disorder characterized by unwanted 
survivin expression, e.g., cervical or pancreatic cancers. 

In another preferred embodiment the iRNA agent silences the Her2/Neu gene, and 
thus can be used to treat a subject having or at risk for a disorder characterized by unwanted 
Her2/Neu expression, e.g., breast cancer.

In another preferred embodiment the iRNA agent silences the topoisomerase I gene,
and thus can be used to treat a subject having or at risk for a disorder characterized by 
unwanted topoisomerase I expression, e.g., ovarian and colon cancers. 

In a preferred embodiment the iRNA agent silences the topoisomerase II alpha gene, 
and thus can be used to treat a subject having or at risk for a disorder characterized by 
unwanted topoisomerase II expression, e.g., breast and colon cancers. 

In a preferred embodiment the iRNA agent silences mutations in the p73 gene, and 
thus can be used to treat a subject having or at risk for a disorder characterized by unwanted 
p73 expression, e.g., colorectal adenocarcinoma. 

In a preferred embodiment the iRNA agent silences mutations in the 
p21(WAF1/CIP1) gene, and thus can be used to treat a subject having or at risk for a disorder
characterized by unwanted p21(WAF1/CIP1) expression, e.g., liver cancer.

In a preferred embodiment the iRNA agent silences mutations in the p27(KIP1) gene,
and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted p27(KIP1) expression, e.g., liver cancer.

In a preferred embodiment the iRNA agent silences mutations in the PPM1D gene, 
and thus can be used to treat a subject having or at risk for a disorder characterized by 
unwanted PPM1D expression, e.g., breast cancer. 

In a preferred embodiment the iRNA agent silences mutations in the RAS gene, and 
thus can be used to treat a subject having or at risk for a disorder characterized by unwanted 
RAS expression, e.g., breast cancer. 

In another preferred embodiment the iRNA agent silences mutations in the caveolin I 
gene, and thus can be used to treat a subject having or at risk for a disorder characterized by 
unwanted caveolin I expression, e.g., esophageal squamous cell carcinoma. 

In another preferred embodiment the iRNA agent silences mutations in the MIB I 
gene, and thus can be used to treat a subject having or at risk for a disorder characterized by 
unwanted MIB I expression, e.g., male breast carcinoma (MBC). 

In another preferred embodiment the iRNA agent silences mutations in the MTA1
gene, and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted MTA1 expression, e.g., ovarian carcinoma.

In another preferred embodiment the iRNA agent silences mutations in the M68 gene,
and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted M68 expression, e.g., human adenocarcinomas of the esophagus, stomach, colon,
and rectum.

In preferred embodiments the iRNA agent silences mutations in tumor suppressor
genes, and thus can be used as a method to promote apoptotic activity in combination with
chemotherapeutics.

In a preferred embodiment the iRNA agent silences mutations in the p53 tumor
suppressor gene, and thus can be used to treat a subject having or at risk for a disorder
characterized by unwanted p53 expression, e.g., gall bladder, pancreatic and lung cancers.

In a preferred embodiment the iRNA agent silences mutations in the p53 family
member DN-p63, and thus can be used to treat a subject having or at risk for a disorder
characterized by unwanted DN-p63 expression, e.g., squamous cell carcinoma.

In a preferred embodiment the iRNA agent silences mutations in the pRb tumor
suppressor gene, and thus can be used to treat a subject having or at risk for a disorder
characterized by unwanted pRb expression, e.g., oral squamous cell carcinoma.

In a preferred embodiment the iRNA agent silences mutations in the APC1 tumor
suppressor gene, and thus can be used to treat a subject having or at risk for a disorder
characterized by unwanted APC1 expression, e.g., colon cancer.

In a preferred embodiment the iRNA agent silences mutations in the BRCA1 tumor
suppressor gene, and thus can be used to treat a subject having or at risk for a disorder
characterized by unwanted BRCA1 expression, e.g., breast cancer.

In a preferred embodiment the iRNA agent silences mutations in the PTEN tumor
suppressor gene, and thus can be used to treat a subject having or at risk for a disorder
characterized by unwanted PTEN expression, e.g., hamartomas, gliomas, and prostate and
endometrial cancers.

In a preferred embodiment the iRNA agent silences MLL fusion genes, e.g., MLL- 
AF9, and thus can be used to treat a subject having or at risk for a disorder characterized by 
unwanted MLL fusion gene expression, e.g., acute leukemias.

In another preferred embodiment the iRNA agent silences the BCR/ABL fusion gene,
and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted BCR/ABL fusion gene expression, e.g., acute and chronic leukemias.

In another preferred embodiment the iRNA agent silences the TEL/AML1 fusion
gene, and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted TEL/AML1 fusion gene expression, e.g., childhood acute leukemia.

In another preferred embodiment the iRNA agent silences the EWS/FLI1 fusion gene,
and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted EWS/FLI1 fusion gene expression, e.g., Ewing Sarcoma.

In another preferred embodiment the iRNA agent silences the TLS/FUS1 fusion gene,
and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted TLS/FUS1 fusion gene expression, e.g., Myxoid liposarcoma.

In another preferred embodiment the iRNA agent silences the PAX3/FKHR fusion
gene, and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted PAX3/FKHR fusion gene expression, e.g., Myxoid liposarcoma.

In another preferred embodiment the iRNA agent silences the AML1/ETO fusion 
gene, and thus can be used to treat a subject having or at risk for a disorder characterized by 
unwanted AML1/ETO fusion gene expression, e.g., acute leukemia. 

In another aspect, the invention features, a method of treating a subject, e.g., a human,
at risk for or afflicted with a disease or disorder that may benefit by angiogenesis inhibition,
e.g., cancer. The method includes:

providing an iRNA agent, e.g., an iRNA agent having a structure described herein,
which iRNA agent is homologous to and can silence, e.g., by cleavage, a gene which
mediates angiogenesis;

administering the iRNA agent to a subject,

thereby treating the subject.

In a preferred embodiment the iRNA agent silences the alpha v-integrin gene, and
thus can be used to treat a subject having or at risk for a disorder characterized by unwanted
alpha v-integrin, e.g., brain tumors or tumors of epithelial origin.

In a preferred embodiment the iRNA agent silences the Flt-1 receptor gene, and thus
can be used to treat a subject having or at risk for a disorder characterized by unwanted Flt-1
receptors, e.g., cancer and rheumatoid arthritis.

In a preferred embodiment the iRNA agent silences the tubulin gene, and thus can be
used to treat a subject having or at risk for a disorder characterized by unwanted tubulin, e.g.,
cancer and retinal neovascularization.

In another aspect, the invention features a method of treating a subject infected with a
virus or at risk for or afflicted with a disorder or disease associated with a viral infection.
The method includes: 

providing an iRNA agent, e.g., an iRNA agent having a structure described herein,
which iRNA agent is homologous to and can silence, e.g., by cleavage, a viral gene or a
cellular gene which mediates viral function, e.g., entry or growth;

administering the iRNA agent to a subject, preferably a human subject, 

thereby treating the subject. 

Thus, the invention provides for a method of treating patients infected by the Human
Papilloma Virus (HPV) or at risk for or afflicted with a disorder mediated by HPV, e.g.,
cervical cancer. HPV is linked to 95% of cervical carcinomas and thus an antiviral therapy is
an attractive method to treat these cancers and other symptoms of viral infection.

In a preferred embodiment, the expression of an HPV gene is reduced. In another
preferred embodiment, the HPV gene is one of the group of E2, E6, or E7.

In a preferred embodiment the expression of a human gene that is required for HPV
replication is reduced.

The invention also includes a method of treating patients infected by the Human 
Immunodeficiency Virus (HIV) or at risk for or afflicted with a disorder mediated by HIV, 
e.g., Acquired Immune Deficiency Syndrome (AIDS). 

In a preferred embodiment, the expression of an HIV gene is reduced. In another
preferred embodiment, the HIV gene is CCR5, Gag, or Rev.

In a preferred embodiment the expression of a human gene that is required for HIV
replication is reduced. In another preferred embodiment, the gene is CD4 or Tsg101.

The invention also includes a method for treating patients infected by the Hepatitis B
Virus (HBV) or at risk for or afflicted with a disorder mediated by HBV, e.g., cirrhosis and
hepatocellular carcinoma.

In a preferred embodiment, the expression of an HBV gene is reduced. In another
preferred embodiment, the targeted HBV gene encodes one of the group of the tail region of
the HBV core protein, the pre-cregious (pre-c) region, or the cregious (c) region. In another
preferred embodiment, a targeted HBV-RNA sequence is comprised of the poly(A) tail.

In a preferred embodiment the expression of a human gene that is required for HBV
replication is reduced.

The invention also provides for a method of treating patients infected by the Hepatitis 
A Virus (HAV), or at risk for or afflicted with a disorder mediated by HAV. 

In a preferred embodiment the expression of a human gene that is required for HAV
replication is reduced.

The present invention provides for a method of treating patients infected by the
Hepatitis C Virus (HCV), or at risk for or afflicted with a disorder mediated by HCV, e.g.,
cirrhosis.

In a preferred embodiment, the expression of an HCV gene is reduced.

In another preferred embodiment the expression of a human gene that is required for
HCV replication is reduced.

The present invention also provides for a method of treating patients infected by
any of the group of Hepatitis Viral strains comprising hepatitis D, E, F, G, or H, or patients at
risk for or afflicted with a disorder mediated by any of these strains of hepatitis.

In a preferred embodiment, the expression of a hepatitis D, E, F, G, or H gene is
reduced.

In another preferred embodiment the expression of a human gene that is required for
hepatitis D, E, F, G or H replication is reduced.

Methods of the invention also provide for treating patients infected by the
Respiratory Syncytial Virus (RSV) or at risk for or afflicted with a disorder mediated by
RSV, e.g., lower respiratory tract infection in infants and childhood asthma, pneumonia and
other complications, e.g., in the elderly.

In a preferred embodiment, the expression of an RSV gene is reduced. In another
preferred embodiment, the targeted RSV gene is one of the group of genes N, L, or P.

In a preferred embodiment the expression of a human gene that is required for RSV 
replication is reduced. 

Methods of the invention provide for treating patients infected by the Herpes 
Simplex Virus (HSV) or at risk for or afflicted with a disorder mediated by HSV, e.g., genital
herpes and cold sores as well as life-threatening or sight-impairing disease mainly in 
immunocompromised patients. 

In a preferred embodiment, the expression of a HSV gene is reduced. In another 
preferred embodiment, the targeted HSV gene encodes DNA polymerase or the helicase- 
primase. 

In a preferred embodiment the expression of a human gene that is required for HSV 
replication is reduced. 

The invention also provides a method for treating patients infected by the herpes 
Cytomegalovirus (CMV) or at risk for or afflicted with a disorder mediated by CMV, e.g., 
congenital virus infections and morbidity in immunocompromised patients. 

In a preferred embodiment, the expression of a CMV gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for CMV 
replication is reduced. 

Methods of the invention also provide for treating patients infected by
the herpes Epstein Barr Virus (EBV) or at risk for or afflicted with a disorder mediated by
EBV, e.g., NK/T-cell lymphoma, non-Hodgkin lymphoma, and Hodgkin disease.

In a preferred embodiment, the expression of an EBV gene is reduced.

In a preferred embodiment the expression of a human gene that is required for EBV 
replication is reduced. 

Methods of the invention also provide for treating patients infected by Kaposi's 
Sarcoma-associated Herpes Virus (KSHV), also called human herpesvirus 8, or patients at 
risk for or afflicted with a disorder mediated by KSHV, e.g., Kaposi's sarcoma, multicentric 
Castleman's disease and AIDS-associated primary effusion lymphoma.

In a preferred embodiment, the expression of a KSHV gene is reduced.

In a preferred embodiment the expression of a human gene that is required for KSHV 
replication is reduced. 

The invention also includes a method for treating patients infected by the JC Virus
(JCV) or a disease or disorder associated with this virus, e.g., progressive multifocal
leukoencephalopathy (PML).

In a preferred embodiment, the expression of a JCV gene is reduced.

In a preferred embodiment the expression of a human gene that is required for JCV
replication is reduced.

Methods of the invention also provide for treating patients infected by the myxovirus
or at risk for or afflicted with a disorder mediated by myxovirus, e.g., influenza.

In a preferred embodiment, the expression of a myxovirus gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for
myxovirus replication is reduced.

Methods of the invention also provide for treating patients infected by the rhinovirus
or at risk for or afflicted with a disorder mediated by rhinovirus, e.g., the common cold.

In a preferred embodiment, the expression of a rhinovirus gene is reduced.

In a preferred embodiment the expression of a human gene that is required for
rhinovirus replication is reduced.

Methods of the invention also provide for treating patients infected by the coronavirus
or at risk for or afflicted with a disorder mediated by coronavirus, e.g., the common cold.

In a preferred embodiment, the expression of a coronavirus gene is reduced.

In a preferred embodiment the expression of a human gene that is required for
coronavirus replication is reduced.

Methods of the invention also provide for treating patients infected by the flavivirus
West Nile or at risk for or afflicted with a disorder mediated by West Nile Virus.

In a preferred embodiment, the expression of a West Nile Virus gene is reduced. In
another preferred embodiment, the West Nile Virus gene is one of the group comprising E,
NS3, or NS5.

In a preferred embodiment the expression of a human gene that is required for West
Nile Virus replication is reduced.

Methods of the invention also provide for treating patients infected by the St. Louis
Encephalitis flavivirus, or at risk for or afflicted with a disease or disorder associated with
this virus, e.g., viral haemorrhagic fever or neurological disease.

In a preferred embodiment, the expression of a St. Louis Encephalitis gene is reduced.

In a preferred embodiment the expression of a human gene that is required for St.
Louis Encephalitis virus replication is reduced.

Methods of the invention also provide for treating patients infected by the Tick-borne 
encephalitis flavivirus, or at risk for or afflicted with a disorder mediated by Tick-borne 
encephalitis virus, e.g., viral haemorrhagic fever and neurological disease. 
10 In a preferred embodiment, the expression of a Tick-borne encephalitis virus gene is 

reduced. 

In a preferred embodiment the expression of a human gene that is required for Tick- 
borne encephalitis virus replication is reduced. 

Methods of the invention also provide for methods of treating patients infected by the 
1 5 Murray Valley encephalitis flavivirus, which commonly results in viral haemorrhagic fever 
and neurological disease. 

In a preferred embodiment, the expression of a Murray Valley encephalitis virus gene 
is reduced. 

In a preferred embodiment the expression of a human gene that is required for Murray
Valley encephalitis virus replication is reduced.

The invention also includes methods for treating patients infected by the dengue 
flavivirus, or a disease or disorder associated with this virus, e.g., dengue haemorrhagic 
fever. 

In a preferred embodiment, the expression of a dengue virus gene is reduced.

In a preferred embodiment the expression of a human gene that is required for dengue
virus replication is reduced.

Methods of the invention also provide for treating patients infected by the Simian 
Virus 40 (SV40) or at risk for or afflicted with a disorder mediated by SV40, e.g., 
tumorigenesis. 

In a preferred embodiment, the expression of an SV40 gene is reduced.

In a preferred embodiment the expression of a human gene that is required for SV40
replication is reduced.

The invention also includes methods for treating patients infected by the Human T
Cell Lymphotropic Virus (HTLV), or a disease or disorder associated with this virus, e.g.,
leukemia and myelopathy.

In a preferred embodiment, the expression of an HTLV gene is reduced. In another
preferred embodiment the HTLV1 gene is the Tax transcriptional activator.

In a preferred embodiment the expression of a human gene that is required for HTLV 
replication is reduced. 

Methods of the invention also provide for treating patients infected by the
Moloney-Murine Leukemia Virus (Mo-MuLV) or at risk for or afflicted with a disorder
mediated by Mo-MuLV, e.g., T-cell leukemia.

In a preferred embodiment, the expression of a Mo-MuLV gene is reduced.

In a preferred embodiment the expression of a human gene that is required for
Mo-MuLV replication is reduced.

Methods of the invention also provide for treating patients infected by the 
encephalomyocarditis virus (EMCV) or at risk for or afflicted with a disorder mediated by 
EMCV, e.g. myocarditis. EMCV leads to myocarditis in mice and pigs and is capable of 
infecting human myocardial cells. This virus is therefore a concern for patients undergoing 
xenotransplantation.

In a preferred embodiment, the expression of an EMCV gene is reduced.

In a preferred embodiment the expression of a human gene that is required for EMCV
replication is reduced.

The invention also includes a method for treating patients infected by the measles
virus (MV) or at risk for or afflicted with a disorder mediated by MV, e.g. measles.

In a preferred embodiment, the expression of an MV gene is reduced.

In a preferred embodiment the expression of a human gene that is required for MV
replication is reduced.

The invention also includes a method for treating patients infected by the Varicella
zoster virus (VZV) or at risk for or afflicted with a disorder mediated by VZV, e.g. chicken
pox or shingles (also called zoster).

In a preferred embodiment, the expression of a VZV gene is reduced.

In a preferred embodiment the expression of a human gene that is required for VZV 
replication is reduced. 

The invention also includes a method for treating patients infected by an adenovirus 
or at risk for or afflicted with a disorder mediated by an adenovirus, e.g. respiratory tract 
infection. 

In a preferred embodiment, the expression of an adenovirus gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for 
adenovirus replication is reduced. 

The invention includes a method for treating patients infected by a yellow fever virus 
(YFV) or at risk for or afflicted with a disorder mediated by a YFV, e.g. respiratory tract 
infection. 

In a preferred embodiment, the expression of a YFV gene is reduced. In another 
preferred embodiment, the preferred gene is one of a group that includes the E, NS2A, or 
NS3 genes. 

In a preferred embodiment the expression of a human gene that is required for YFV 
replication is reduced. 

Methods of the invention also provide for treating patients infected by the poliovirus 
or at risk for or afflicted with a disorder mediated by poliovirus, e.g., polio. 

In a preferred embodiment, the expression of a poliovirus gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for 
poliovirus replication is reduced. 

Methods of the invention also provide for treating patients infected by a poxvirus or 
at risk for or afflicted with a disorder mediated by a poxvirus, e.g., smallpox.

In a preferred embodiment, the expression of a poxvirus gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for 
poxvirus replication is reduced. 

In another aspect, the invention features methods of treating a subject infected with a
pathogen, e.g., a bacterial, amoebic, parasitic, or fungal pathogen. The method includes:

providing an iRNA agent, e.g., an siRNA having a structure described herein, where the
siRNA is homologous to and can silence, e.g., by cleavage, a pathogen gene;

administering the iRNA agent to a subject, preferably a human subject,

thereby treating the subject.

The target gene can be one involved in growth, cell wall synthesis, protein synthesis, 
transcription, energy metabolism, e.g., the Krebs cycle, or toxin production. 

Thus, the present invention provides for a method of treating patients infected by a 
Plasmodium that causes malaria. 

In a preferred embodiment, the expression of a Plasmodium gene is reduced. In 
another preferred embodiment, the gene is apical membrane antigen 1 (AMA1). 

In a preferred embodiment the expression of a human gene that is required for 
Plasmodium replication is reduced. 

The invention also includes methods for treating patients infected by the 
Mycobacterium ulcerans, or a disease or disorder associated with this pathogen, e.g. Buruli 
ulcers. 

In a preferred embodiment, the expression of a Mycobacterium ulcerans gene is 
reduced. 

In a preferred embodiment the expression of a human gene that is required for 
Mycobacterium ulcerans replication is reduced. 

The invention also includes methods for treating patients infected by the 
Mycobacterium tuberculosis, or a disease or disorder associated with this pathogen, e.g. 
tuberculosis. 

In a preferred embodiment, the expression of a Mycobacterium tuberculosis gene is 
reduced. 

In a preferred embodiment the expression of a human gene that is required for 
Mycobacterium tuberculosis replication is reduced. 

The invention also includes methods for treating patients infected by the 
Mycobacterium leprae, or a disease or disorder associated with this pathogen, e.g. leprosy. 

In a preferred embodiment, the expression of a Mycobacterium leprae gene is 
reduced. 

In a preferred embodiment the expression of a human gene that is required for 
Mycobacterium leprae replication is reduced. 
The invention also includes methods for treating patients infected by the bacteria
Staphylococcus aureus, or a disease or disorder associated with this pathogen, e.g. infections
of the skin and mucous membranes.

In a preferred embodiment, the expression of a Staphylococcus aureus gene is
reduced.

In a preferred embodiment the expression of a human gene that is required for 
Staphylococcus aureus replication is reduced. 

The invention also includes methods for treating patients infected by the bacteria 
Streptococcus pneumoniae, or a disease or disorder associated with this pathogen, e.g. 
pneumonia or childhood lower respiratory tract infection.

In a preferred embodiment, the expression of a Streptococcus pneumoniae gene is 
reduced. 

In a preferred embodiment the expression of a human gene that is required for
Streptococcus pneumoniae replication is reduced.

The invention also includes methods for treating patients infected by the bacteria
Streptococcus pyogenes, or a disease or disorder associated with this pathogen, e.g. Strep
throat or Scarlet fever.

In a preferred embodiment, the expression of a Streptococcus pyogenes gene is 
reduced. 

In a preferred embodiment the expression of a human gene that is required for
Streptococcus pyogenes replication is reduced.

The invention also includes methods for treating patients infected by the bacteria
Chlamydia pneumoniae, or a disease or disorder associated with this pathogen, e.g.
pneumonia or childhood lower respiratory tract infection.

In a preferred embodiment, the expression of a Chlamydia pneumoniae gene is
reduced.

In a preferred embodiment the expression of a human gene that is required for 
Chlamydia pneumoniae replication is reduced. 

The invention also includes methods for treating patients infected by the bacteria
Mycoplasma pneumoniae, or a disease or disorder associated with this pathogen, e.g.
pneumonia or childhood lower respiratory tract infection.

In a preferred embodiment, the expression of a Mycoplasma pneumoniae gene is
reduced.

In a preferred embodiment the expression of a human gene that is required for 
Mycoplasma pneumoniae replication is reduced. 

In one aspect, the invention features a method of treating a subject, e.g., a human, at
risk for or afflicted with a disease or disorder characterized by an unwanted immune 
response, e.g., an inflammatory disease or disorder, or an autoimmune disease or disorder. 
The method includes: 

providing an iRNA agent, e.g., an iRNA agent having a structure described herein, 
which iRNA agent is homologous to and can silence, e.g., by cleavage, a gene which 
mediates an unwanted immune response; 

administering the iRNA agent to a subject, 

thereby treating the subject. 

In a preferred embodiment the disease or disorder is an ischemia or reperfusion 
injury, e.g., ischemia or reperfusion injury associated with acute myocardial infarction, 
unstable angina, cardiopulmonary bypass, surgical intervention e.g., angioplasty, e.g., 
percutaneous transluminal coronary angioplasty, the response to a transplanted organ or
tissue, e.g., transplanted cardiac or vascular tissue; or thrombolysis. 

In a preferred embodiment the disease or disorder is restenosis, e.g., restenosis 
associated with surgical intervention e.g., angioplasty, e.g., percutaneous transluminal 
coronary angioplasty. 

In a preferred embodiment the disease or disorder is Inflammatory Bowel Disease,
e.g., Crohn Disease or Ulcerative Colitis.

In a preferred embodiment the disease or disorder is inflammation associated with an
infection or injury.

In a preferred embodiment the disease or disorder is asthma, lupus, multiple sclerosis,
diabetes, e.g., type II diabetes, or arthritis, e.g., rheumatoid or psoriatic.

In particularly preferred embodiments the iRNA agent silences an integrin or
co-ligand thereof, e.g., VLA4, VCAM, ICAM.

In particularly preferred embodiments the iRNA agent silences a selectin or co-ligand
thereof, e.g., P-selectin, E-selectin (ELAM), L-selectin, P-selectin glycoprotein-1 (PSGL-1).

In particularly preferred embodiments the iRNA agent silences a component of the
complement system, e.g., C3, C5, C3aR, C5aR, C3 convertase, C5 convertase.

In particularly preferred embodiments the iRNA agent silences a chemokine or
receptor thereof, e.g., TNFα, TNFβ, IL-1α, IL-1β, IL-2, IL-2R, IL-4, IL-4R, IL-5, IL-6, IL-8,
TNFRI, TNFRII, IgE, SCYA11, CCR3.

In other embodiments the iRNA agent silences GCSF, Gro1, Gro2, Gro3, PF4, MIG,
Pro-Platelet Basic Protein (PPBP), MIP-1α, MIP-1β, RANTES, MCP-1, MCP-2, MCP-3,
CMBKR1, CMBKR2, CMBKR3, CMBKR5, AIF-1, I-309.

In one aspect, the invention features a method of treating a subject, e.g., a human, at
risk for or afflicted with acute pain or chronic pain. The method includes:

providing an iRNA agent, which iRNA is homologous to and can silence, e.g., by
cleavage, a gene which mediates the processing of pain;

administering the iRNA to a subject,

thereby treating the subject.

In particularly preferred embodiments the iRNA agent silences a component of an ion
channel.

In particularly preferred embodiments the iRNA agent silences a neurotransmitter 
receptor or ligand. 

In one aspect, the invention features a method of treating a subject, e.g., a human, at
risk for or afflicted with a neurological disease or disorder. The method includes:

providing an iRNA agent which iRNA is homologous to and can silence, e.g., by
cleavage, a gene which mediates a neurological disease or disorder;

administering the iRNA agent to a subject,

thereby treating the subject.

In a preferred embodiment the disease or disorder is Alzheimer Disease or Parkinson
Disease.

In particularly preferred embodiments the iRNA agent silences an amyloid-family
gene, e.g., APP; a presenilin gene, e.g., PSEN1 and PSEN2, or α-synuclein.

In a preferred embodiment the disease or disorder is a neurodegenerative trinucleotide
repeat disorder, e.g., Huntington disease, dentatorubral pallidoluysian atrophy or a
spinocerebellar ataxia, e.g., SCA1, SCA2, SCA3 (Machado-Joseph disease), SCA7 or SCA8.

In particularly preferred embodiments the iRNA agent silences HD, DRPLA, SCA1, SCA2,
MJD1, CACNL1A4, SCA7, SCA8.

The loss of heterozygosity (LOH) can result in hemizygosity for sequence, e.g.,
genes, in the area of LOH. This can result in a significant and useful genetic difference
between normal and disease-state cells, e.g., cancer cells. This difference can arise because
a gene or other sequence is heterozygous in euploid cells but is hemizygous in cells having
LOH. The regions of LOH will often include a gene, the loss of which promotes unwanted
proliferation, e.g., a tumor suppressor gene, and other sequences including, e.g., other genes,
in some cases a gene which is essential for normal function, e.g., growth. Methods of the
invention rely, in part, on the specific cleavage or silencing of one allele of an essential gene
with an iRNA agent of the invention. The iRNA agent is selected such that it targets the
single allele of the essential gene found in the cells having LOH but does not silence the
other allele, which is present in cells which do not show LOH. In essence, it discriminates
between the two alleles, preferentially silencing the selected allele. In essence,
polymorphisms, e.g., SNPs of essential genes that are affected by LOH, are used as a target
for a disorder characterized by cells having LOH, e.g., cancer cells having LOH.

For example, one of ordinary skill in the art can identify essential genes which are in
proximity to tumor suppressor genes, and which are within a LOH region which includes the
tumor suppressor gene. The gene encoding the large subunit of human RNA polymerase II,
POLR2A, a gene located in close proximity to the tumor suppressor gene p53, is such a gene.
It frequently occurs within a region of LOH in cancer cells. Other genes that occur within
LOH regions and are lost in many cancer cell types include the group comprising replication
protein A 70-kDa subunit, replication protein A 32-kD, ribonucleotide reductase, thymidylate
synthase, TATA associated factor 2H, ribosomal protein S14, eukaryotic initiation factor 5A,
alanyl tRNA synthetase, cysteinyl tRNA synthetase, NaK ATPase, alpha-1 subunit, and
transferrin receptor.

Accordingly, the invention features a method of treating a disorder characterized by
LOH, e.g., cancer. The method includes:

optionally, determining the genotype of the allele of a gene in the region of LOH and
preferably determining the genotype of both alleles of the gene in a normal cell;

providing an iRNA agent which preferentially cleaves or silences the allele found in
the LOH cells;

administering the iRNA to the subject,

thereby treating the disorder.

The invention also includes an iRNA agent disclosed herein, e.g., an iRNA agent which
can preferentially silence, e.g., cleave, one allele of a polymorphic gene.
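
The allele-selection step of the LOH method above can be sketched as follows. This is only a minimal illustration, assuming genotypes are recorded as single-letter SNP alleles; the function name `choose_target_allele` and the example alleles are hypothetical and do not come from the disclosure.

```python
from typing import Optional, Tuple


def choose_target_allele(normal_genotype: Tuple[str, str],
                         loh_allele: str) -> Optional[Tuple[str, str]]:
    """Return (allele_to_silence, allele_to_spare), or None if the site is uninformative.

    normal_genotype: the two alleles of the essential gene in a normal cell.
    loh_allele: the single allele retained in cells showing LOH.
    """
    first, second = normal_genotype
    if first == second:
        # Homozygous in normal cells: no polymorphism distinguishes LOH cells.
        return None
    if loh_allele not in normal_genotype:
        raise ValueError("LOH allele must be one of the two normal-cell alleles")
    # Silence the allele retained in LOH cells; normal cells keep the other allele.
    spared = second if loh_allele == first else first
    return loh_allele, spared


# Hypothetical SNP: normal cells are A/G, LOH cells retain only G.
print(choose_target_allele(("A", "G"), "G"))
```

Under these assumptions a heterozygous site yields a target allele, while a homozygous site is rejected because silencing it would also affect normal cells.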

In another aspect, the invention provides a method of cleaving or silencing more than
one gene with an iRNA agent. In these embodiments the iRNA agent is selected so that it
has sufficient homology to a sequence found in more than one gene. For example, the
sequence AAGCTGGCCCTGGACATGGAGAT (SEQ ID NO:6736) is conserved between
mouse lamin B1, lamin B2, keratin complex 2-gene 1 and lamin A/C. Thus an iRNA agent
targeted to this sequence would effectively silence the entire collection of genes.

The invention also includes an iRNA agent disclosed herein, which can silence more 
than one gene. 
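
One way to look for a sequence shared by several genes is to intersect their fixed-length substrings. The sketch below is illustrative only: the function `shared_kmers` is hypothetical, the flanking bases in `toy_genes` are invented so the example runs, and exact matching is an assumption (the text requires only "sufficient homology"). The 23-nucleotide target is the one quoted above.

```python
from typing import List, Set


def shared_kmers(sequences: List[str], k: int = 23) -> Set[str]:
    """Return every length-k substring that occurs in all input sequences."""
    if not sequences:
        return set()

    def kmers(seq: str) -> Set[str]:
        seq = seq.upper()
        return {seq[i:i + k] for i in range(len(seq) - k + 1)}

    common = kmers(sequences[0])
    for seq in sequences[1:]:
        common &= kmers(seq)
    return common


# Conserved 23-mer quoted in the text (SEQ ID NO:6736).
TARGET = "AAGCTGGCCCTGGACATGGAGAT"

# Hypothetical flanking bases, invented only to make the example runnable.
toy_genes = [
    "CCGTA" + TARGET + "GGCAT",
    "TTAGC" + TARGET + "ACGTT",
    "GATTC" + TARGET + "TTTCA",
]

print(TARGET in shared_kmers(toy_genes))
```

Real transcripts would be read from sequence records rather than typed in, and a tolerance for mismatches could replace the exact-match intersection.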

ROUTE OF DELIVERY

For ease of exposition the formulations, compositions and methods in this section are 
discussed largely with regard to unmodified iRNA agents. It should be understood, however, 
that these formulations, compositions and methods can be practiced with other iRNA agents, 
e.g., modified iRNA agents, and such practice is within the invention. A composition that 
includes an iRNA can be delivered to a subject by a variety of routes. Exemplary routes
include: intravenous, topical, rectal, anal, vaginal, nasal, pulmonary, ocular. 

The iRNA molecules of the invention can be incorporated into pharmaceutical 
compositions suitable for administration. Such compositions typically include one or more 
species of iRNA and a pharmaceutically acceptable carrier. As used herein the language
"pharmaceutically acceptable carrier" is intended to include any and all solvents, dispersion
media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents,
and the like, compatible with pharmaceutical administration. The use of such media and
agents for pharmaceutically active substances is well known in the art. Except insofar as any
conventional media or agent is incompatible with the active compound, use thereof in the
compositions is contemplated. Supplementary active compounds can also be incorporated
into the compositions.

The pharmaceutical compositions of the present invention may be administered in a
number of ways depending upon whether local or systemic treatment is desired and upon the
area to be treated. Administration may be topical (including ophthalmic, vaginal, rectal,
intranasal, transdermal), oral or parenteral. Parenteral administration includes intravenous
drip, subcutaneous, intraperitoneal or intramuscular injection, or intrathecal or
intraventricular administration.

The route and site of administration may be chosen to enhance targeting. For
example, to target muscle cells, intramuscular injection into the muscles of interest would be
a logical choice. Lung cells might be targeted by administering the iRNA in aerosol form.
The vascular endothelial cells could be targeted by coating a balloon catheter with the iRNA
and mechanically introducing the DNA.

Formulations for topical administration may include transdermal patches, ointments,
lotions, creams, gels, drops, suppositories, sprays, liquids and powders. Conventional
pharmaceutical carriers, aqueous, powder or oily bases, thickeners and the like may be
necessary or desirable. Coated condoms, gloves and the like may also be useful.

Compositions for oral administration include powders or granules, suspensions or
solutions in water, syrups, elixirs or non-aqueous media, tablets, capsules, lozenges, or
troches. In the case of tablets, carriers that can be used include lactose, sodium citrate and
salts of phosphoric acid. Various disintegrants such as starch, and lubricating agents such as
magnesium stearate, sodium lauryl sulfate and talc, are commonly used in tablets. For oral
administration in capsule form, useful diluents are lactose and high molecular weight
polyethylene glycols. When aqueous suspensions are required for oral use, the nucleic acid
compositions can be combined with emulsifying and suspending agents. If desired, certain
sweetening and/or flavoring agents can be added.

Compositions for intrathecal or intraventricular administration may include sterile
aqueous solutions which may also contain buffers, diluents and other suitable additives.

Formulations for parenteral administration may include sterile aqueous solutions
which may also contain buffers, diluents and other suitable additives. Intraventricular
injection may be facilitated by an intraventricular catheter, for example, attached to a
reservoir. For intravenous use, the total concentration of solutes should be controlled to
render the preparation isotonic.
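
As a rough illustration of controlling total solute concentration, the ideal osmolarity of a formulation can be estimated by summing molar concentration times the number of dissolved particles per solute. The sketch below rests on stated assumptions: ideal (van 't Hoff) behaviour, a hypothetical acceptance window of 270 to 310 mOsm/L chosen for illustration, and function names that are not part of the disclosure.

```python
from typing import Dict, Tuple

# solute name -> (molar concentration in mol/L, particles released per formula unit)
Solutes = Dict[str, Tuple[float, int]]


def ideal_osmolarity_mosm(solutes: Solutes) -> float:
    """Ideal (van 't Hoff) osmolarity in mOsm/L, ignoring non-ideal behaviour."""
    return 1000.0 * sum(conc * particles for conc, particles in solutes.values())


def is_roughly_isotonic(osmolarity: float,
                        low: float = 270.0,
                        high: float = 310.0) -> bool:
    """Check against an assumed acceptance window (illustrative limits only)."""
    return low <= osmolarity <= high


# 0.9% (w/v) NaCl: 9 g/L divided by 58.44 g/mol, dissociating into 2 ions.
saline = {"NaCl": (9.0 / 58.44, 2)}
print(round(ideal_osmolarity_mosm(saline)), is_roughly_isotonic(ideal_osmolarity_mosm(saline)))
```

A real preparation would be checked by measured osmolality, since concentrated or polyionic solutes such as nucleic acids deviate from ideal behaviour.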

For ocular administration, ointments or droppable liquids may be delivered by ocular
delivery systems known to the art such as applicators or eye droppers. Such compositions can 
include mucomimetics such as hyaluronic acid, chondroitin sulfate, hydroxypropyl 
methylcellulose or poly(vinyl alcohol), preservatives such as sorbic acid, EDTA or 
benzylchronium chloride, and the usual quantities of diluents and/or carriers. 

Topical Delivery 

For ease of exposition the formulations, compositions and methods in this section are 
discussed largely with regard to unmodified iRNA agents. It should be understood, however, 
that these formulations, compositions and methods can be practiced with other iRNA agents, 
e.g., modified iRNA agents, and such practice is within the invention. In a preferred 
embodiment, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA
which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof) is delivered to a subject via topical administration. "Topical 
administration" refers to the delivery to a subject by contacting the formulation directly to a 
surface of the subject. The most common form of topical delivery is to the skin, but a 
composition disclosed herein can also be directly applied to other surfaces of the body, e.g., 
to the eye, a mucous membrane, to surfaces of a body cavity or to an internal surface. As 
mentioned above, the most common topical delivery is to the skin. The term encompasses 
several routes of administration including, but not limited to, topical and transdermal. These 
modes of administration typically include penetration of the skin's permeability barrier and 
efficient delivery to the target tissue or stratum. Topical administration can be used as a 
means to penetrate the epidermis and dermis and ultimately achieve systemic delivery of the 
composition. Topical administration can also be used as a means to selectively deliver 
oligonucleotides to the epidermis or dermis of a subject, or to specific strata thereof, or to an 
underlying tissue. 

The term "skin," as used herein, refers to the epidermis and/or dermis of an animal. 
Mammalian skin consists of two major, distinct layers. The outer layer of the skin is called 
the epidermis. The epidermis is comprised of the stratum corneum, the stratum granulosum, 
the stratum spinosum, and the stratum basale, with the stratum corneum being at the surface 
of the skin and the stratum basale being the deepest portion of the epidermis. The epidermis 
is between 50 µm and 0.2 mm thick, depending on its location on the body.

Beneath the epidermis is the dermis, which is significantly thicker than the epidermis.
The dermis is primarily composed of collagen in the form of fibrous bundles. The
collagenous bundles provide support for, inter alia, blood vessels, lymph capillaries, glands,
nerve endings and immunologically active cells.

One of the major functions of the skin as an organ is to regulate the entry of
substances into the body. The principal permeability barrier of the skin is provided by the
stratum corneum, which is formed from many layers of cells in various states of
differentiation. The spaces between cells in the stratum corneum are filled with different
lipids arranged in lattice-like formations that provide seals to further enhance the skin's
permeability barrier.

The permeability barrier provided by the skin is such that it is largely impermeable to
molecules having molecular weight greater than about 750 Da. For larger molecules to cross
the skin's permeability barrier, mechanisms other than normal osmosis must be used.
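
To put the 750 Da figure in perspective, the mass of a short RNA duplex can be estimated from approximate per-residue masses. This is a back-of-the-envelope sketch: the residue masses are rounded textbook values that ignore end groups and counter-ions, the 21-nucleotide strand is hypothetical, and the helper names are not from the disclosure.

```python
# Approximate average masses (Da) of RNA nucleotide residues within a chain.
# Rounded values; end groups, modifications and counter-ions are ignored.
RESIDUE_MASS_DA = {"A": 329.2, "C": 305.2, "G": 345.2, "U": 306.2}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

SKIN_CUTOFF_DA = 750.0  # approximate limit stated in the text


def strand_mass(seq: str) -> float:
    """Rough mass of a single RNA strand in daltons."""
    return sum(RESIDUE_MASS_DA[base] for base in seq.upper())


def duplex_mass(sense: str) -> float:
    """Rough mass of a blunt duplex formed by a strand and its reverse complement."""
    antisense = "".join(COMPLEMENT[base] for base in reversed(sense.upper()))
    return strand_mass(sense) + strand_mass(antisense)


# Hypothetical 21-nucleotide strand used only for illustration.
example = "AAGCUGGCCCUGGACAUGGAG"
mass = duplex_mass(example)
print(f"{mass:.0f} Da, about {mass / SKIN_CUTOFF_DA:.0f}x the cutoff")
```

Even this rough estimate places a short duplex more than an order of magnitude above the stated limit, which is why penetration enhancers and related mechanisms are discussed for topical delivery.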

Several factors determine the permeability of the skin to administered agents. These 
factors include the characteristics of the treated skin, the characteristics of the delivery agent, 
interactions between both the drug and delivery agent and the drug and skin, the dosage of 
the drug applied, the form of treatment, and the post treatment regimen. To selectively target
the epidermis and dermis, it is sometimes possible to formulate a composition that comprises 
one or more penetration enhancers that will enable penetration of the drug to a preselected 
stratum. 

Transdermal delivery is a valuable route for the administration of lipid soluble
therapeutics. The dermis is more permeable than the epidermis and therefore absorption is
much more rapid through abraded, burned or denuded skin. Inflammation and other
physiologic conditions that increase blood flow to the skin also enhance transdermal
absorption. Absorption via this route may be enhanced by the use of an oily vehicle
(inunction) or through the use of one or more penetration enhancers. Other effective ways to
deliver a composition disclosed herein via the transdermal route include hydration of the skin
and the use of controlled release topical patches. The transdermal route provides a
potentially effective means to deliver a composition disclosed herein for systemic and/or
local therapy.

In addition, iontophoresis (transfer of ionic solutes through biological membranes
under the influence of an electric field) (Lee et al., Critical Reviews in Therapeutic Drug
Carrier Systems, 1991, p. 163), phonophoresis or sonophoresis (use of ultrasound to enhance
the absorption of various therapeutic agents across biological membranes, notably the skin
and the cornea) (Lee et al., Critical Reviews in Therapeutic Drug Carrier Systems, 1991, p.
166), and optimization of vehicle characteristics relative to dose position and retention at the
site of administration (Lee et al., Critical Reviews in Therapeutic Drug Carrier Systems,
1991, p. 168) may be useful methods for enhancing the transport of topically applied
compositions across skin and mucosal sites.

The compositions and methods provided may also be used to examine the function of
various proteins and genes in vitro in cultured or preserved dermal tissues and in animals.
The invention can thus be applied to examine the function of any gene. The methods of the
invention can also be used therapeutically or prophylactically, for example, for the
treatment of animals that are known or suspected to suffer from diseases such as psoriasis,
lichen planus, toxic epidermal necrolysis, erythema multiforme, basal cell carcinoma,
squamous cell carcinoma, malignant melanoma, Paget's disease, Kaposi's sarcoma,
pulmonary fibrosis, Lyme disease and viral, fungal and bacterial infections of the skin.

Pulmonary Delivery 

For ease of exposition the formulations, compositions and methods in this section are 
discussed largely with regard to unmodified iRNA agents. It should be understood, however, 
that these formulations, compositions and methods can be practiced with other iRNA agents,
e.g., modified iRNA agents, and such practice is within the invention. A composition that
includes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA
which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof) can be administered to a subject by pulmonary delivery. Pulmonary
delivery compositions can be delivered by inhalation by the patient of a dispersion so that the
composition, preferably iRNA, within the dispersion can reach the lung where it can be
readily absorbed through the alveolar region directly into blood circulation. Pulmonary
delivery can be effective both for systemic delivery and for localized delivery to treat 
diseases of the lungs. 

Pulmonary delivery can be achieved by different approaches, including the use of
nebulized, aerosolized, micellar and dry powder-based formulations. Delivery can be
achieved with liquid nebulizers, aerosol-based inhalers, and dry powder dispersion devices.
Metered-dose devices are preferred. One of the benefits of using an atomizer or inhaler is
that the potential for contamination is minimized because the devices are self contained. Dry
powder dispersion devices, for example, deliver drugs that may be readily formulated as dry
powders. An iRNA composition may be stably stored as a lyophilized or spray-dried powder
by itself or in combination with suitable powder carriers. The delivery of a composition for
inhalation can be mediated by a dosing timing element which can include a timer, a dose
counter, time measuring device, or a time indicator which when incorporated into the device
enables dose tracking, compliance monitoring, and/or dose triggering to a patient during
administration of the aerosol medicament.

The term "powder" means a composition that consists of finely dispersed solid
particles that are free flowing and capable of being readily dispersed in an inhalation device
and subsequently inhaled by a subject so that the particles reach the lungs to permit
penetration into the alveoli. Thus, the powder is said to be "respirable." Preferably the
average particle size is less than about 10 µm in diameter, preferably with a relatively uniform
spheroidal shape distribution. More preferably the diameter is less than about 7.5 µm and
most preferably less than about 5.0 µm. Usually the particle size distribution is between
about 0.1 µm and about 5 µm in diameter, particularly about 0.3 µm to about 5 µm.

The term "dry" means that the composition has a moisture content below about 10%
by weight (% w) water, usually below about 5% w and preferably less than about 3% w. A
dry composition can be such that the particles are readily dispersible in an inhalation device
to form an aerosol.
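
The numerical limits in the definitions of "powder" and "dry" can be expressed as simple threshold checks. The sketch below is illustrative only: it treats the "about" values as hard cutoffs, which is an assumption, and the function names and tier labels are hypothetical.

```python
def particle_size_tier(diameter_um: float) -> str:
    """Map an average particle diameter (micrometres) to the preference tiers in the text."""
    if diameter_um <= 0:
        raise ValueError("diameter must be positive")
    if diameter_um < 5.0:
        return "most preferred"
    if diameter_um < 7.5:
        return "more preferred"
    if diameter_um < 10.0:
        return "preferred"
    return "outside stated range"


def dryness_tier(moisture_pct_w: float) -> str:
    """Map moisture content (% by weight) to the tiers used in the definition of dry."""
    if not 0.0 <= moisture_pct_w <= 100.0:
        raise ValueError("moisture content must be between 0 and 100 % w")
    if moisture_pct_w < 3.0:
        return "preferred"
    if moisture_pct_w < 5.0:
        return "usual"
    if moisture_pct_w < 10.0:
        return "dry"
    return "not dry"


# Hypothetical batch measurements.
print(particle_size_tier(3.2), "/", dryness_tier(2.1))
```

A batch specification built on these definitions would typically also record the full particle size distribution, since the text gives a usual range of about 0.1 µm to about 5 µm.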

The term "therapeutically effective amount" is the amount present in the composition 
that is needed to provide the desired level of drug in the subject to be treated to give the 
anticipated physiological response. 
The term "physiologically effective amount" is that amount delivered to a subject to
give the desired palliative or curative effect. 

The term "pharmaceutically acceptable carrier" means that the carrier can be taken
into the lungs with no significant adverse toxicological effects on the lungs.

The types of pharmaceutical excipients that are useful as carriers include stabilizers
such as human serum albumin (HSA), bulking agents such as carbohydrates, amino acids and
polypeptides; pH adjusters or buffers; salts such as sodium chloride; and the like. These
carriers may be in a crystalline or amorphous form or may be a mixture of the two.

Bulking agents that are particularly valuable include compatible carbohydrates,
polypeptides, amino acids or combinations thereof. Suitable carbohydrates include
monosaccharides such as galactose, D-mannose, sorbose, and the like; disaccharides, such as
lactose, trehalose, and the like; cyclodextrins, such as 2-hydroxypropyl-β-cyclodextrin;
polysaccharides, such as raffinose, maltodextrins, dextrans, and the like; and alditols, such as
mannitol, xylitol, and the like. A preferred group of carbohydrates includes lactose,
trehalose, raffinose, maltodextrins, and mannitol. Suitable polypeptides include aspartame.
Amino acids include alanine and glycine, with glycine being preferred.

Additives, which are minor components of the composition of this invention, may be 
included for conformational stability during spray drying and for improving dispersibility of 
the powder. These additives include hydrophobic amino acids such as tryptophan, tyrosine, 
leucine, phenylalanine, and the like.

Suitable pH adjusters or buffers include organic salts prepared from organic acids and 
bases, such as sodium citrate, sodium ascorbate, and the like; sodium citrate is preferred. 

Pulmonary administration of a micellar iRNA formulation may be achieved through
metered dose spray devices with propellants such as tetrafluoroethane, heptafluoroethane,
dimethylfluoropropane, tetrafluoropropane, butane, isobutane, dimethyl ether and other
non-CFC and CFC propellants.

Oral or Nasal Delivery 

For ease of exposition, the formulations, compositions and methods in this section are
discussed largely with regard to unmodified iRNA agents. It should be understood, however,
that these formulations, compositions and methods can be practiced with other iRNA agents,
e.g., modified iRNA agents, and such practice is within the invention. Both the oral and
nasal membranes offer advantages over other routes of administration. For example, drugs
administered through these membranes have a rapid onset of action, provide therapeutic
plasma levels, avoid the first-pass effect of hepatic metabolism, and avoid exposure of the drug
to the hostile gastrointestinal (GI) environment. Additional advantages include easy access
to the membrane sites so that the drug can be applied, localized and removed easily.

In oral delivery, compositions can be targeted to a surface of the oral cavity, e.g., to
the sublingual mucosa, which includes the membrane of the ventral surface of the tongue and the
floor of the mouth, or the buccal mucosa, which constitutes the lining of the cheek. The
sublingual mucosa is relatively permeable, thus giving rapid absorption and acceptable
bioavailability of many drugs. Further, the sublingual mucosa is convenient, acceptable and
easily accessible.

The ability of molecules to permeate through the oral mucosa appears to be related to 
molecular size, lipid solubility and peptide protein ionization. Small molecules, less than
1000 daltons, appear to cross mucosa rapidly. As molecular size increases, the permeability
decreases rapidly. Lipid soluble compounds are more permeable than non-lipid soluble
molecules. Maximum absorption occurs when molecules are un-ionized or neutral in 
electrical charges. Therefore charged molecules present the biggest challenges to absorption 
through the oral mucosae. 

A pharmaceutical composition of iRNA may also be administered to the buccal cavity
of a human being by spraying into the cavity, without inhalation, from a metered dose spray
dispenser, a mixed micellar pharmaceutical formulation as described above and a propellant. 
In one embodiment, the dispenser is first shaken prior to spraying the pharmaceutical 
formulation and propellant into the buccal cavity. 

Devices 

For ease of exposition, the devices, formulations, compositions and methods in this
section are discussed largely with regard to unmodified iRNA agents. It should be
understood, however, that these devices, formulations, compositions and methods can be
practiced with other iRNA agents, e.g., modified iRNA agents, and such practice is within the
invention. An iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA
which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof) can be disposed on or in a device, e.g., a device which is implanted or
otherwise placed in a subject. Exemplary devices include devices which are introduced into
the vasculature, e.g., devices inserted into the lumen of a vascular tissue, or devices which
themselves form a part of the vasculature, including stents, catheters, heart valves, and other
vascular devices. These devices, e.g., catheters or stents, can be placed in the vasculature of
the lung, heart, or leg.

Other devices include non-vascular devices, e.g., devices implanted in the
peritoneum, or in organ or glandular tissue, e.g., artificial organs. The device can release a
therapeutic substance in addition to an iRNA, e.g., a device can release insulin.

Other devices include artificial joints, e.g., hip joints, and other orthopedic implants.

In one embodiment, unit doses or measured doses of a composition that includes 
iRNA are dispensed by an implanted device. The device can include a sensor that monitors a 
parameter within a subject. For example, the device can include a pump and, optionally,
associated electronics.

Tissue, e.g., cells or organs, can be treated with an iRNA agent, e.g., a
double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can
be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a
double-stranded iRNA agent, or sRNA agent, or precursor thereof) ex vivo and then administered or
implanted in a subject.

The tissue can be autologous, allogeneic, or xenogeneic tissue. E.g., tissue can be
treated to reduce graft v. host disease. In other embodiments, the tissue is allogeneic and the
tissue is treated to treat a disorder characterized by unwanted gene expression in that tissue. 
E.g., tissue, e.g., hematopoietic cells, e.g., bone marrow hematopoietic cells, can be treated to 
inhibit unwanted cell proliferation. 

Introduction of treated tissue, whether autologous or transplant, can be combined with
other therapies.

In some implementations, the iRNA-treated cells are insulated from other cells, e.g.,
by a semi-permeable porous barrier that prevents the cells from leaving the implant, but
enables molecules from the body to reach the cells and molecules produced by the cells to
enter the body. In one embodiment, the porous barrier is formed from alginate.



WO 2004/080406
PCT/US2004/007070



In one embodiment, a contraceptive device is coated with or contains an iRNA agent,
e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA
agent which can be processed into a sRNA agent, or a DNA which encodes an iRNA agent,
e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof). Exemplary
devices include condoms, diaphragms, IUDs (intrauterine devices), sponges, vaginal
sheaths, and birth control devices. In one embodiment, the iRNA is chosen to inactivate sperm
or egg. In another embodiment, the iRNA is chosen to be complementary to a viral or
pathogen RNA, e.g., an RNA of an STD. In some instances, the iRNA composition can
include a spermicide.

DOSAGE

In one aspect, the invention features a method of administering an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, to a subject (e.g., a human subject). The 
method includes administering a unit dose of the iRNA agent, e.g., a sRNA agent, e.g., a
double-stranded sRNA agent that (a) has a double-stranded part that is 19-25 nucleotides (nt) long,
preferably 21-23 nt, (b) is complementary to a target RNA (e.g., an endogenous or pathogen
target RNA), and, optionally, (c) includes at least one 3' overhang 1-5 nucleotides long. In
one embodiment, the unit dose is less than 1.4 mg per kg of body weight, or less than 10, 5, 2,
1, 0.5, 0.1, 0.05, 0.01, 0.005, 0.001, 0.0005, 0.0001, 0.00005 or 0.00001 mg per kg of
body weight, and less than 200 nmole of RNA agent (e.g., about 4.4 x 10^16 copies) per kg of
body weight, or less than 1500, 750, 300, 150, 75, 15, 7.5, 1.5, 0.75, 0.15, 0.075, 0.015,
0.0075, 0.0015, 0.00075, 0.00015 nmole of RNA agent per kg of body weight.
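
The relationship between a mass-based dose (mg per kg) and a molar dose (nmole per kg) depends on the molecular weight of the agent, which this passage does not state. The Python sketch below shows the unit arithmetic only; the function names and the 7,000 g/mol figure used in the usage lines are illustrative assumptions, not values taken from the text.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI definition)


def mg_per_kg_to_nmol_per_kg(dose_mg_per_kg: float, mol_weight_g_per_mol: float) -> float:
    """Convert a dose in mg/kg to nmol/kg for a given molecular weight.

    mg -> g is a factor of 1e-3 and mol -> nmol is a factor of 1e9,
    so the combined factor is 1e6.
    """
    if mol_weight_g_per_mol <= 0:
        raise ValueError("molecular weight must be positive")
    return dose_mg_per_kg * 1e6 / mol_weight_g_per_mol


def nmol_to_copies(amount_nmol: float) -> float:
    """Convert an amount in nmol to a number of molecules (copies)."""
    return amount_nmol * 1e-9 * AVOGADRO


if __name__ == "__main__":
    assumed_mw = 7000.0  # g/mol; hypothetical value chosen only for illustration
    nmol = mg_per_kg_to_nmol_per_kg(1.4, assumed_mw)
    print(f"{nmol:.1f} nmol/kg, {nmol_to_copies(nmol):.3e} copies/kg")
```

Under that assumed molecular weight, 1.4 mg per kg works out to about 200 nmole per kg; a different molecular weight scales the molar figure in inverse proportion.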

The defined amount can be an amount effective to treat or prevent a disease or 
disorder, e.g., a disease or disorder associated with the target RNA. The unit dose, for 
example, can be administered by injection (e.g., intravenous or intramuscular), an inhaled
dose, or a topical application. Particularly preferred dosages are less than 2, 1, or 0.1 mg/kg
of body weight. 

In a preferred embodiment, the unit dose is administered less frequently than once a
day, e.g., less frequently than once every 2, 4, 8 or 30 days. In another embodiment, the
unit dose is not administered with a frequency (e.g., not a regular frequency). For example,
the unit dose may be administered a single time.

In one embodiment, the effective dose is administered with other traditional
therapeutic modalities. In one embodiment, the subject has a viral infection and the modality
is an antiviral agent other than an iRNA agent, e.g., other than a double-stranded iRNA
agent, or sRNA agent. In another embodiment, the subject has atherosclerosis and the
effective dose of an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, is 
administered in combination with, e.g., after surgical intervention, e.g., angioplasty. 

In one embodiment, a subject is administered an initial dose and one or more 
maintenance doses of an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a 
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof). The maintenance dose or doses are generally lower than the initial dose, 
e.g., one-half less than the initial dose. A maintenance regimen can include treating the subject
with a dose or doses ranging from 0.01 µg to 1.4 mg/kg of body weight per day, e.g., 10, 1,
0.1, 0.01, 0.001, or 0.00001 mg per kg of bodyweight per day. The maintenance doses are 
preferably administered no more than once every 5, 10, or 30 days. Further, the treatment 
regimen may last for a period of time which will vary depending upon the nature of the 
particular disease, its severity and the overall condition of the patient. In preferred 
embodiments the dosage may be delivered no more than once per day, e.g., no more than 
once per 24, 36, 48, or more hours, e.g., no more than once for every 5 or 8 days. Following 
treatment, the patient can be monitored for changes in his condition and for alleviation of the 
symptoms of the disease state. The dosage of the compound may either be increased in the 
event the patient does not respond significantly to current dosage levels, or the dose may be 
decreased if an alleviation of the symptoms of the disease state is observed, if the disease 
state has been ablated, or if undesired side-effects are observed. 

The effective dose can be administered in a single dose or in two or more doses, as 
desired or considered appropriate under the specific circumstances. If desired to facilitate 
repeated or frequent infusions, implantation of a delivery device, e.g., a pump, semi- 
permanent stent (e.g., intravenous, intraperitoneal, intracisternal or intracapsular), or 
reservoir may be advisable. 

In one embodiment, the iRNA agent pharmaceutical composition includes a plurality 
of iRNA agent species. In another embodiment, the iRNA agent species has sequences that
are non-overlapping and non-adjacent to another species with respect to a naturally occurring
target sequence. In another embodiment, the plurality of iRNA agent species is specific for 
different naturally occurring target genes. In another embodiment, the iRNA agent is allele 
specific. 

In some cases, a patient is treated with an iRNA agent in conjunction with other
therapeutic modalities. For example, a patient being treated for a viral disease, e.g., an HIV
associated disease (e.g., AIDS), may be administered an iRNA agent specific for a target gene
essential to the virus in conjunction with a known antiviral agent (e.g., a protease inhibitor or
reverse transcriptase inhibitor). In another example, a patient being treated for cancer may be
administered an iRNA agent specific for a target essential for tumor cell proliferation in
conjunction with a chemotherapy.

Following successful treatment, it may be desirable to have the patient undergo 
maintenance therapy to prevent the recurrence of the disease state, wherein the compound of 
the invention is administered in maintenance doses, ranging from 0.01 µg to 100 g per kg of
body weight (see US 6,107,094).

The concentration of the iRNA agent composition is an amount sufficient to be 
effective in treating or preventing a disorder or to regulate a physiological condition in 
humans. The concentration or amount of iRNA agent administered will depend on the 
parameters determined for the agent and the method of administration, e.g., nasal, buccal,
pulmonary. For example, nasal formulations tend to require much lower concentrations of
some ingredients in order to avoid irritation or burning of the nasal passages. It is sometimes 
desirable to dilute an oral formulation up to 10-100 times in order to provide a suitable nasal 
formulation. 
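
The dilution arithmetic in the preceding paragraph can be sketched in Python as follows. The function name `nasal_concentration_range` and the example concentration are hypothetical; the only figures taken from the text are the 10-fold and 100-fold dilution bounds, and the sketch assumes simple proportional dilution of the whole formulation.

```python
from typing import Tuple


def nasal_concentration_range(
    oral_concentration: float,
    min_fold: float = 10.0,
    max_fold: float = 100.0,
) -> Tuple[float, float]:
    """Return (lowest, highest) concentration after diluting an oral formulation.

    The result is in the same units as the input. The default fold values
    mirror the 10-100 times dilution mentioned in the text.
    """
    if oral_concentration < 0:
        raise ValueError("concentration must be non-negative")
    if not 0 < min_fold <= max_fold:
        raise ValueError("fold values must satisfy 0 < min_fold <= max_fold")
    return (oral_concentration / max_fold, oral_concentration / min_fold)


if __name__ == "__main__":
    low, high = nasal_concentration_range(50.0)  # 50.0 is an arbitrary example value
    print(low, high)
```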

Certain factors may influence the dosage required to effectively treat a subject, 
including but not limited to the severity of the disease or disorder, previous treatments, the
general health and/or age of the subject, and other diseases present. Moreover, treatment of a 
subject with a therapeutically effective amount of an iRNA agent, e.g., a double-stranded 
iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be 
processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a
double-stranded iRNA agent, or sRNA agent, or precursor thereof) can include a single treatment
or, preferably, can include a series of treatments. It will also be appreciated that the effective
dosage of an iRNA agent such as a sRNA agent used for treatment may increase or decrease
over the course of a particular treatment. Changes in dosage may result and become apparent 
from the results of diagnostic assays as described herein. For example, the subject can be 
monitored after administering an iRNA agent composition. Based on information from the
monitoring, an additional amount of the iRNA agent composition can be administered. 

Dosing is dependent on severity and responsiveness of the disease condition to be 
treated, with the course of treatment lasting from several days to several months, or until a 
cure is effected or a diminution of disease state is achieved. Optimal dosing schedules can be 
calculated from measurements of drug accumulation in the body of the patient. Persons of 
ordinary skill can easily determine optimum dosages, dosing methodologies and repetition 
rates. Optimum dosages may vary depending on the relative potency of individual 
compounds, and can generally be estimated based on EC50s found to be effective in in vitro 
and in vivo animal models. In some embodiments, the animal models include transgenic 
animals that express a human gene, e.g. a gene that produces a target RNA. The transgenic 
animal can be deficient for the corresponding endogenous RNA. In another embodiment, the 
composition for testing includes an iRNA agent that is complementary, at least in an internal
region, to a sequence that is conserved between the target RNA in the animal model and the 
target RNA in a human. 

The inventors have discovered that iRNA agents described herein can be administered 
to mammals, particularly large mammals such as nonhuman primates or humans in a number 
of ways. 

In one embodiment, the administration of the iRNA agent, e.g., a double-stranded 
iRNA agent, or sRNA agent, composition is parenteral, e.g. intravenous (e.g., as a bolus or as 
a diffusible infusion), intradermal, intraperitoneal, intramuscular, intrathecal, intraventricular, 
intracranial, subcutaneous, transmucosal, buccal, sublingual, endoscopic, rectal, oral, vaginal, 
topical, pulmonary, intranasal, urethral or ocular. Administration can be provided by the 
subject or by another person, e.g., a health care provider. The medication can be provided in 
measured doses or in a dispenser which delivers a metered dose. Selected modes of delivery 
are discussed in more detail below.

The invention provides methods, compositions, and kits for rectal administration or
delivery of iRNA agents described herein. 

Accordingly, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent,
or precursor thereof) described herein, e.g., a therapeutically effective amount of an iRNA
agent described herein, e.g., an iRNA agent having a double stranded region of less than 40,
and preferably less than 30 nucleotides and having one or two 1-3 nucleotide single strand 3'
overhangs can be administered rectally, e.g., introduced through the rectum into the lower or
upper colon. This approach is particularly useful in the treatment of inflammatory disorders,
disorders characterized by unwanted cell proliferation, e.g., polyps, or colon cancer. 

The medication can be delivered to a site in the colon by introducing a dispensing 
device, e.g., a flexible, camera-guided device similar to that used for inspection of the colon 
or removal of polyps, which includes means for delivery of the medication. 

The rectal administration of the iRNA agent can be by means of an enema. The iRNA
agent of the enema can be dissolved in a saline or buffered solution. The rectal
administration can also be by means of a suppository, which can include other ingredients, e.g.,
an excipient, e.g., cocoa butter or hydroxypropylmethylcellulose.

Any of the iRNA agents described herein can be administered orally, e.g., in the form 
of tablets, capsules, gel capsules, lozenges, troches or liquid syrups. Further, the composition 
can be applied topically to a surface of the oral cavity. 

Any of the iRNA agents described herein can be administered buccally. For example, 
the medication can be sprayed into the buccal cavity or applied directly, e.g., in a liquid, 
solid, or gel form to a surface in the buccal cavity. This administration is particularly
desirable for the treatment of inflammations of the buccal cavity, e.g., the gums or tongue.
In one embodiment, the buccal administration is by spraying into the cavity, e.g.,
without inhalation, from a dispenser, e.g., a metered dose spray dispenser that dispenses the 
pharmaceutical composition and a propellant. 

Any of the iRNA agents described herein can be administered to ocular tissue. For 
example, the medications can be applied to the surface of the eye or nearby tissue, e.g., the 
inside of the eyelid. They can be applied topically, e.g., by spraying, in drops, as an
eyewash, or an ointment. Administration can be provided by the subject or by another
person, e.g., a health care provider. The medication can be provided in measured doses or in 
a dispenser which delivers a metered dose. The medication can also be administered to the 
interior of the eye, and can be introduced by a needle or other delivery device which can 
introduce it to a selected area or structure. Ocular treatment is particularly desirable for 
treating inflammation of the eye or nearby tissue. 

Any of the iRNA agents described herein can be administered directly to the skin. 
For example, the medication can be applied topically or delivered in a layer of the skin, e.g., 
by the use of a microneedle or a battery of microneedles which penetrate into the skin, but 
preferably not into the underlying muscle tissue. Administration of the iRNA agent 
composition can be topical. Topical applications can, for example, deliver the composition 
to the dermis or epidermis of a subject. Topical administration can be in the form of 
transdermal patches, ointments, lotions, creams, gels, drops, suppositories, sprays, liquids or 
powders. A composition for topical administration can be formulated as a liposome, micelle, 
emulsion, or other lipophilic molecular assembly. The transdermal administration can be 
applied with at least one penetration enhancer, such as iontophoresis, phonophoresis, and 
sonophoresis. 

Any of the iRNA agents described herein can be administered to the pulmonary 
system. Pulmonary administration can be achieved by inhalation or by the introduction of a 
delivery device into the pulmonary system, e.g., by introducing a delivery device which can 
dispense the medication. A preferred method of pulmonary delivery is by inhalation. The 
medication can be provided in a dispenser which delivers the medication, e.g., wet or dry, in 
a form sufficiently small such that it can be inhaled. The device can deliver a metered dose 
of medication. The subject, or another person, can administer the medication. 

Pulmonary delivery is effective not only for disorders which directly affect 
pulmonary tissue, but also for disorders which affect other tissue. 

iRNA agents can be formulated as a liquid or nonliquid, e.g., a powder, crystal, or 
aerosol for pulmonary delivery. 

Any of the iRNA agents described herein can be administered nasally. Nasal 
administration can be achieved by introduction of a delivery device into the nose, e.g., by 
introducing a delivery device which can dispense the medication. Methods of nasal delivery
include spray, aerosol, liquid, e.g., by drops, or by topical administration to a surface of the
nasal cavity. The medication can be provided in a dispenser with delivery of the medication, 
e.g., wet or dry, in a form sufficiently small such that it can be inhaled. The device can 
deliver a metered dose of medication. The subject, or another person, can administer the 
medication. 

Nasal delivery is effective not only for disorders which directly affect nasal tissue, but 
also for disorders which affect other tissue.

iRNA agents can be formulated as a liquid or nonliquid, e.g., a powder, crystal, or
aerosol for nasal delivery.

An iRNA agent can be packaged in a viral natural capsid or in a chemically or 
enzymatically produced artificial capsid or structure derived therefrom. 

The dosage of a pharmaceutical composition including an iRNA agent can be
administered in order to alleviate the symptoms of a disease state, e.g., cancer or a 
cardiovascular disease. A subject can be treated with the pharmaceutical composition by any 
of the methods mentioned above. 

Gene expression in a subject can be modulated by administering a pharmaceutical 
composition including an iRNA agent. 

A subject can be treated by administering a defined amount of an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent 
which can be processed into a sRNA agent) composition that is in a powdered form, e.g., a 
collection of microparticles, such as crystalline particles. The composition can include a 
plurality of iRNA agents, e.g., specific for one or more different endogenous target RNAs. 
The method can include other features described herein. 

A subject can be treated by administering a defined amount of an iRNA agent 
composition that is prepared by a method that includes spray-drying, i.e., atomizing a liquid
solution, emulsion, or suspension, immediately exposing the droplets to a drying gas, and 
collecting the resulting porous powder particles. The composition can include a plurality of 
iRNA agents, e.g., specific for one or more different endogenous target RNAs. The method 
can include other features described herein. 

The iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a 
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA
which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof), can be provided in a powdered, crystallized or other finely divided form,
with or without a carrier, e.g., a micro- or nano-particle suitable for inhalation or other
pulmonary delivery. This can include providing an aerosol preparation, e.g., an aerosolized
spray-dried composition. The aerosol composition can be provided in and/or dispensed by a
metered dose delivery device. 

The subject can be treated for a condition treatable by inhalation, e.g., by aerosolizing
a spray-dried iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA
which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof) composition and inhaling the aerosolized composition. The iRNA agent
can be an sRNA. The composition can include a plurality of iRNA agents, e.g., specific for 
one or more different endogenous target RNAs. The method can include other features 
described herein. 

A subject can be treated by, for example, administering a composition including an
effective/defined amount of an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA
agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA
agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or
sRNA agent, or precursor thereof), wherein the composition is prepared by a method that
includes spray-drying, lyophilization, vacuum drying, evaporation, fluid bed drying, or a
combination of these techniques.

In another aspect, the invention features a method that includes: evaluating a
parameter related to the abundance of a transcript in a cell of a subject; comparing the
evaluated parameter to a reference value; and if the evaluated parameter has a preselected
relationship to the reference value (e.g., it is greater), administering an iRNA agent (or a
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA
which encodes an iRNA agent or precursor thereof) to the subject. In one embodiment, the
iRNA agent includes a sequence that is complementary to the evaluated transcript. For
example, the parameter can be a direct measure of transcript levels, a measure of a protein
level, a disease or disorder symptom or characterization (e.g., rate of cell proliferation and/or
tumor mass, viral load).
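
The evaluate, compare and administer sequence described above can be sketched in Python as a simple gating check. The function name `should_administer`, its parameters and the numeric values in the usage lines are hypothetical; the default "greater than" relationship follows the example given in the text, and any other preselected relationship can be passed in as a comparison function.

```python
import operator
from typing import Callable


def should_administer(
    evaluated: float,
    reference: float,
    relation: Callable[[float, float], bool] = operator.gt,
) -> bool:
    """Return True when the evaluated parameter has the preselected
    relationship to the reference value.

    By default the relationship is "evaluated is greater than reference".
    """
    return bool(relation(evaluated, reference))


if __name__ == "__main__":
    # Arbitrary illustrative numbers, e.g. a transcript level in arbitrary units.
    print(should_administer(120.0, 100.0))               # above the reference
    print(should_administer(80.0, 100.0))                # below the reference
    print(should_administer(80.0, 100.0, operator.lt))   # "less than" relationship
```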

In another aspect, the invention features a method that includes: administering a first
amount of a composition that comprises an iRNA agent, e.g., a double-stranded iRNA agent,
or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed into a
sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, 
or sRNA agent, or precursor thereof) to a subject, wherein the iRNA agent includes a strand 
substantially complementary to a target nucleic acid; evaluating an activity associated with a 
protein encoded by the target nucleic acid; wherein the evaluation is used to determine if a 
second amount should be administered. In a preferred embodiment the method includes 
administering a second amount of the composition, wherein the timing of administration or 
dosage of the second amount is a function of the evaluating. The method can include other 
features described herein. 

In another aspect, the invention features a method of administering a source of a 
double-stranded iRNA agent (ds iRNA agent) to a subject. The method includes 
administering or implanting a source of a ds iRNA agent, e.g., a sRNA agent, that (a) 
includes a double-stranded region that is 19-25 nucleotides long, preferably 21-23 
nucleotides, (b) is complementary to a target RNA (e.g., an endogenous RNA or a pathogen 
RNA), and, optionally, (c) includes at least one 3' overhang 1-5 nt long. In one embodiment, 
the source releases ds iRNA agent over time, e.g. the source is a controlled or a slow release 
source, e.g., a microparticle that gradually releases the ds iRNA agent. In another 
embodiment, the source is a pump, e.g., a pump that includes a sensor or a pump that can 
release one or more unit doses. 

In one aspect, the invention features a pharmaceutical composition that includes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) 
including a nucleotide sequence complementary to a target RNA, e.g., substantially and/or 
exactly complementary. The target RNA can be a transcript of an endogenous human gene. 
In one embodiment, the iRNA agent (a) is 19-25 nucleotides long, preferably 21-23 
nucleotides, (b) is complementary to an endogenous target RNA, and, optionally, (c) includes 
at least one 3' overhang 1-5 nt long. In one embodiment, the pharmaceutical composition can 
be an emulsion, microemulsion, cream, jelly, or liposome.

In one example the pharmaceutical composition includes an iRNA agent mixed with a
topical delivery agent. The topical delivery agent can be a plurality of microscopic vesicles. 
The microscopic vesicles can be liposomes. In a preferred embodiment the liposomes are 
cationic liposomes. 

In another aspect, the pharmaceutical composition includes an iRNA agent, e.g., a
double-stranded iRNA agent, or sRNA agent (e.g., a precursor, e.g., a larger iRNA agent
which can be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a
double-stranded iRNA agent, or sRNA agent, or precursor thereof) admixed with a topical
penetration enhancer. In one embodiment, the topical penetration enhancer is a fatty acid.
The fatty acid can be arachidonic acid, oleic acid, lauric acid, caprylic acid, capric acid,
myristic acid, palmitic acid, stearic acid, linoleic acid, linolenic acid, dicaprate, tricaprate,
monolein, dilaurin, glyceryl 1-monocaprate, 1-dodecylazacycloheptan-2-one, an
acylcarnitine, an acylcholine, or a C1-10 alkyl ester, monoglyceride, diglyceride or
pharmaceutically acceptable salt thereof.

In another embodiment, the topical penetration enhancer is a bile salt. The bile salt
can be cholic acid, dehydrocholic acid, deoxycholic acid, glucholic acid, glycholic acid,
glycodeoxycholic acid, taurocholic acid, taurodeoxycholic acid, chenodeoxycholic acid, 
ursodeoxycholic acid, sodium tauro-24,25-dihydro-fusidate, sodium glycodihydrofusidate, 
polyoxyethylene-9-lauryl ether or a pharmaceutically acceptable salt thereof. 

In another embodiment, the penetration enhancer is a chelating agent. The chelating
agent can be EDTA, citric acid, a salicylate, an N-acyl derivative of collagen, laureth-9, an
N-amino acyl derivative of a beta-diketone or a mixture thereof.

In another embodiment, the penetration enhancer is a surfactant, e.g., an ionic or 
nonionic surfactant. The surfactant can be sodium lauryl sulfate, polyoxyethylene-9-lauryl
ether, polyoxyethylene-20-cetyl ether, a perfluorochemical emulsion or mixture thereof.

In another embodiment, the penetration enhancer can be selected from a group 
consisting of unsaturated cyclic ureas, 1-alkyl-alkanones, 1-alkenylazacyclo-alkanones,
steroidal anti-inflammatory agents and mixtures thereof. In yet another embodiment the
penetration enhancer can be a glycol, a pyrrole, an azone, or a terpene.

In one aspect, the invention features a pharmaceutical composition including an
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a 
form suitable for oral delivery. In one embodiment, oral delivery can be used to deliver an 
iRNA agent composition to a cell or a region of the gastro-intestinal tract, e.g., small
intestine, colon (e.g., to treat a colon cancer), and so forth. The oral delivery form can be
tablets, capsules or gel capsules. In one embodiment, the iRNA agent of the pharmaceutical 
composition modulates expression of a cellular adhesion protein, modulates a rate of cellular 
proliferation, or has biological activity against eukaryotic pathogens or retroviruses. In 
another embodiment, the pharmaceutical composition includes an enteric material that
substantially prevents dissolution of the tablets, capsules or gel capsules in a mammalian
stomach. In a preferred embodiment the enteric material is a coating. The coating can be
acetate phthalate, propylene glycol, sorbitan monooleate, cellulose acetate trimellitate,
hydroxypropyl methylcellulose phthalate or cellulose acetate phthalate.

In another embodiment, the oral dosage form of the pharmaceutical composition
includes a penetration enhancer. The penetration enhancer can be a bile salt or a fatty acid.
The bile salt can be ursodeoxycholic acid, chenodeoxycholic acid, or a salt thereof. The
fatty acid can be capric acid, lauric acid, or a salt thereof.

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes an excipient. In one example the excipient is polyethylene glycol. In another
example the excipient is Precirol.

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes a plasticizer. The plasticizer can be diethyl phthalate, triacetin, dibutyl sebacate,
dibutyl phthalate or triethyl citrate. 

In one aspect, the invention features a pharmaceutical composition including an
iRNA agent and a delivery vehicle. In one embodiment, the iRNA agent (a) is 19-25
nucleotides long, preferably 21-23 nucleotides, (b) is complementary to an endogenous target
RNA, and, optionally, (c) includes at least one 3' overhang 1-5 nucleotides long. 

In one embodiment, the delivery vehicle can deliver an iRNA agent, e.g., a double-
stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can
be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-
stranded iRNA agent, or sRNA agent, or precursor thereof) to a cell by a topical route of
administration. The delivery vehicle can be microscopic vesicles. In one example the
microscopic vesicles are liposomes. In a preferred embodiment the liposomes are cationic
liposomes. In another example the microscopic vesicles are micelles.

In one aspect, the
invention features a pharmaceutical composition including an iRNA agent, e.g., a double- 
stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can 
be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double- 
stranded iRNA agent, or sRNA agent, or precursor thereof) in an injectable dosage form. In 
one embodiment, the injectable dosage form of the pharmaceutical composition includes 
sterile aqueous solutions or dispersions and sterile powders. In a preferred embodiment the 
sterile solution can include a diluent such as water, saline solution, fixed oils, polyethylene
glycols, glycerin, or propylene glycol.

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in 
oral dosage form. In one embodiment, the oral dosage form is selected from the group 
consisting of tablets, capsules and gel capsules. In another embodiment, the pharmaceutical 
composition includes an enteric material that substantially prevents dissolution of the tablets, 
capsules or gel capsules in a mammalian stomach. In a preferred embodiment the enteric 
material is a coating. The coating can be acetate phthalate, propylene glycol, sorbitan
monooleate, cellulose acetate trimellitate, hydroxypropyl methylcellulose phthalate or
cellulose acetate phthalate. In one embodiment, the oral dosage form of the pharmaceutical 
composition includes a penetration enhancer, e.g., a penetration enhancer described herein. 

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes an excipient. In one example the excipient is polyethylene glycol. In another
example the excipient is Precirol.

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes a plasticizer. The plasticizer can be diethyl phthalate, triacetin, dibutyl sebacate,
dibutyl phthalate or triethyl citrate. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a 
rectal dosage form. In one embodiment, the rectal dosage form is an enema. In another 
embodiment, the rectal dosage form is a suppository.

In one aspect, the invention features a pharmaceutical composition including an
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a 
vaginal dosage form. In one embodiment, the vaginal dosage form is a suppository. In
another embodiment, the vaginal dosage form is a foam, cream, or gel.

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a
pulmonary or nasal dosage form. In one embodiment, the iRNA agent is incorporated into a
particle, e.g., a macroparticle, e.g., a microsphere. The particle can be produced by spray 
drying, lyophilization, evaporation, fluid bed drying, vacuum drying, or a combination 
thereof. The microsphere can be formulated as a suspension, a powder, or an implantable 
solid. 

In one aspect, the invention features a spray-dried iRNA agent, e.g., a double-
stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can
be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-
stranded iRNA agent, or sRNA agent, or precursor thereof) composition suitable for
inhalation by a subject, including: (a) a therapeutically effective amount of an iRNA agent
suitable for treating a condition in the subject by inhalation; (b) a pharmaceutically
acceptable excipient selected from the group consisting of carbohydrates and amino acids;
and (c) optionally, a dispersibility-enhancing amount of a physiologically-acceptable, water- 
soluble polypeptide. 

In one embodiment, the excipient is a carbohydrate. The carbohydrate can be
selected from the group consisting of monosaccharides, disaccharides, trisaccharides, and
polysaccharides. In a preferred embodiment the carbohydrate is a monosaccharide selected
from the group consisting of dextrose, galactose, mannitol, D-mannose, sorbitol, and sorbose.
In another preferred embodiment the carbohydrate is a disaccharide selected from the group 
consisting of lactose, maltose, sucrose, and trehalose. 

In another embodiment, the excipient is an amino acid. In one embodiment, the
amino acid is a hydrophobic amino acid. In a preferred embodiment the hydrophobic amino
acid is selected from the group consisting of alanine, isoleucine, leucine, methionine,
phenylalanine, proline, tryptophan, and valine. In yet another embodiment the amino acid is a
polar amino acid. In a preferred embodiment the amino acid is selected from the group
consisting of arginine, histidine, lysine, cysteine, glycine, glutamine, serine, threonine,
tyrosine, aspartic acid and glutamic acid.

In one embodiment, the dispersibility-enhancing polypeptide is selected from the 
group consisting of human serum albumin, α-lactalbumin, trypsinogen, and polyalanine.

In one embodiment, the spray-dried iRNA agent composition includes particles 
having a mass median diameter (MMD) of less than 10 microns. In another embodiment,
the spray-dried iRNA agent composition includes particles having a mass median diameter of
less than 5 microns. In yet another embodiment the spray-dried iRNA agent composition 
includes particles having a mass median aerodynamic diameter (MMAD) of less than 5 
microns. 

In certain other aspects, the invention provides kits that include a suitable container 
containing a pharmaceutical formulation of an iRNA agent, e.g., a double-stranded iRNA
agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed 
into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA 
agent, or sRNA agent, or precursor thereof). In certain embodiments the individual 
components of the pharmaceutical formulation may be provided in one container. 
Alternatively, it may be desirable to provide the components of the pharmaceutical
formulation separately in two or more containers, e.g., one container for an iRNA agent
preparation, and at least another for a carrier compound. The kit may be packaged in a 
number of different configurations such as one or more containers in a single box. The 
different components can be combined, e.g., according to instructions provided with the kit. 
The components can be combined according to a method described herein, e.g., to prepare
and administer a pharmaceutical composition. The kit can also include a delivery device.

In another aspect, the invention features a device, e.g., an implantable device, wherein
the device can dispense or administer a composition that includes an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent 
which can be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, or precursor thereof), e.g., an iRNA agent that
silences an endogenous transcript. In one embodiment, the device is coated with the 
composition. In another embodiment the iRNA agent is disposed within the device. In 
another embodiment, the device includes a mechanism to dispense a unit dose of the 
composition. In other embodiments the device releases the composition continuously, e.g., 
by diffusion. Exemplary devices include stents, catheters, pumps, artificial organs or organ 
components (e.g., artificial heart, a heart valve, etc.), and sutures. 

As used herein, the term "crystalline" describes a solid having the structure or 
characteristics of a crystal, i.e., particles of three-dimensional structure in which the plane 
faces intersect at definite angles and in which there is a regular internal structure. The 
compositions of the invention may have different crystalline forms. Crystalline forms can be 
prepared by a variety of methods, including, for example, spray drying. 

The invention is further illustrated by the following examples, which should not be 
construed as further limiting. 

EXAMPLES 

Example 1: Inhibition of endogenous ApoM gene expression in mice 

Apolipoprotein M (ApoM) is a human apolipoprotein predominantly present in high- 
density lipoprotein (HDL) in plasma. ApoM is reported to be expressed exclusively in liver 
and in kidney (Xu N et al., J Biol Chem 1999 Oct 29;274(44):31286-90). Mouse
ApoM is a 21 kD membrane-associated protein, and, in serum, the protein is associated with
HDL particles. ApoM gene expression is regulated by the transcription factor hepatocyte
nuclear factor 1 alpha (Hnf-1α), as Hnf-1α-/- mice are ApoM deficient. In humans, mutations
in the HNF-1 alpha gene represent a common cause of maturity-onset diabetes of the young 
(MODY).

A variety of test iRNAs were synthesized to target the mouse ApoM gene. This gene
was chosen in part because of its high expression levels and exclusive activity in the liver and 
kidney. 

Three different classes of dsRNA agents were synthesized, each class having different 
modifications and features at the 5' and 3' ends; see Table 4.

Table 4

Targeted ORFs

5 The 23mer: AAGTTTGGGCAGCTCTGCTCT (SEQ ID NO: 6708)
19 The 23mer: AAGTGGACATACCGATTGACT (SEQ ID NO: 6709)
25 The 23mer: AACTCAGAACTGAAGGGCGCC (SEQ ID NO: 6710)
27 The 23mer: AAGGGCGCCCAGACATGAAAA (SEQ ID NO: 6711)

3'-UTR (beginning at 645)

42: AAGATAGGAGCCCAGCTTCGA (SEQ ID NO: 6712)

Class I 

21-nt iRNAs, t, deoxythymidine; p, phosphate 

pGUUUGGGCAGCUCUGCUCUtt (SEQ ID NO: 6712) #1 
pAGAGCAGAGCUGCCCAAACtt (SEQ ID NO: 6713) 

pGUGGACAUACCGAUUGACUtt (SEQ ID NO: 6714) #2 
pAGUCAAUCGGUAUGUCCACtt (SEQ ID NO: 6715) 

pCUCAGAACUGAAGGGCGCCtt (SEQ ID NO:6716) #3 
pGGCGCCCUUCAGUUCUGAGtt (SEQ ID NO: 6717) 

pGAUAGGAGCCCAGCUUCGAtt (SEQ ID NO: 6718) #4 
pUCGAAGCUGGGCUCCUAUCtt (SEQ ID NO: 6719)

Class II

21-nt iRNAs, t, deoxythymidine; p, phosphate; ps, thiophosphate

pGUUUGGGCAGCUCUGCUCpsUpstpst (SEQ ID NO: 6720) #11
pAGAGCAGAGCUGCCCAAApsCpstpst (SEQ ID NO: 6721) 

pGUGGACAUACCGAUUGACpsUpstpst (SEQ ID NO: 6722) #13 
pAGUCAAUCGGUAUGUCCApsCpstpst (SEQ ID NO: 6723) 

pCUCAGAACUGAAGGGCGCpsCpstpst (SEQ ID NO: 6724) #15 
pGGCGCCCUUCAGUUCUGApsGpstpst (SEQ ID NO: 6725) 

pGAUAGGAGCCCAGCUUCGpsApstpst (SEQ ID NO:6726) #17 
pUCGAAGCUGGGCUCCUAUpsCpstpst (SEQ ID NO: 6727) 

Class III

23-nt antisense, 21-nt sense, blunt-ended 5'-as

GUUUGGGCAGCUCUGCUCUCU (SEQ ID NO: 6728) #19 
AGAGAGCAGAGCUGCCCAAACUU (SEQ ID NO: 6729) 

GUGGACAUACCGAUUGACUGA (SEQ ID NO: 6730) #21 
UCAGUCAAUCGGUAUGUCCACUU (SEQ ID NO: 6731)

CUCAGAACUGAAGGGCGCCCA (SEQ ID NO: 6732) #23 
PUGGGCGCCCUUCAGUUCUGAGUU (SEQ ID NO: 6733) 

GAUAGGAGCCCAGCUUCGAGU (SEQ ID NO: 6734) #25 
ACUCGAAGCUGGGCUCCUAUCUU (SEQ ID NO: 6735) 



Class I dsRNAs consisted of 21 nucleotide paired sense and antisense strands. The 
sense and antisense strands were each phosphorylated at their 5' ends. The double stranded 
region was 19 nucleotides long and consisted of ribonucleotides. The 3' end of each strand
created a two nucleotide overhang consisting of two deoxyribonucleotide thymidines. See 
constructs #1-4 in Table 4. 

Class II dsRNAs were also 21 nucleotides long, with a 19 nucleotide double strand
region. The sense and antisense strands were each phosphorylated at their 5' ends. The three
3' terminal nucleotides of the sense and antisense strands were phosphorothioate
deoxyribonucleotides, and the two terminal phosphorothioate thymidines were unpaired,
creating a 3' overhang region at each end of the iRNA molecule. See constructs 11, 13, 15,
and 17 in Table 4.

Class III dsRNAs included a 23 ribonucleotide antisense strand and a
21 ribonucleotide sense strand, to form a construct having a blunt 5' and a 3' overhang region.
See constructs 19, 21, 23, and 25 in Table 4. 
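
The strand notation used for the three classes can be reproduced from a sense-strand core with a short script. The following is a minimal sketch, not the synthesis method of the examples: the helper names are hypothetical, and the placement of "p", "ps" and "t" simply mirrors how the strands are written in Table 4.

```python
# Illustrative sketch only; function names are hypothetical and not from the text.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def class_i(core_sense: str) -> tuple[str, str]:
    """Class I notation: 5' phosphate (p), 19-nt RNA core, 3' tt overhang."""
    antisense = reverse_complement(core_sense)
    return f"p{core_sense}tt", f"p{antisense}tt"


def _class_ii_strand(core: str) -> str:
    # As written in Table 4: 'ps' precedes the last core nucleotide and each t.
    return f"p{core[:-1]}ps{core[-1]}pstpst"


def class_ii(core_sense: str) -> tuple[str, str]:
    """Class II notation: as Class I, with thiophosphate (ps) linkages at the 3' end."""
    return _class_ii_strand(core_sense), _class_ii_strand(reverse_complement(core_sense))


def class_iii(sense_21: str) -> tuple[str, str]:
    """Class III notation: 21-nt sense; 23-nt antisense written as the reverse
    complement of the sense strand followed by UU, as the Table 4 entries appear."""
    return sense_21, reverse_complement(sense_21) + "UU"


if __name__ == "__main__":
    core = "GUUUGGGCAGCUCUGCUCU"  # 19-nt core of constructs 1 and 11
    print(class_i(core))
    print(class_ii(core))
    print(class_iii("GUUUGGGCAGCUCUGCUCUCU"))  # sense strand of construct 19
```

Running the sketch on the core of construct 1 yields the same sense and antisense strings listed for constructs 1, 11 and 19, which is a quick way to check a transcription of the table.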

Within each of the three classes of iRNAs, the four dsRNA molecules were designed
to target four different regions of the ApoM transcript. dsRNAs 1, 11, and 19 targeted the 5'
end of the open reading frame (ORF). dsRNAs 2, 13, and 21, and 3, 15, and 23, targeted two
internal regions (one 5' proximal and one 3' proximal) of the ORF, and the 4, 17, and 25
iRNA constructs targeted a region of the 3' untranslated sequence (3' UTS) of the ApoM
mRNA. This is summarized in Table 5.

Table 5. iRNA molecules targeted to mouse ApoM

            iRNA targeted   iRNA targeted   iRNA targeted   iRNA targeted
            to 5' end of    to middle ORF   to middle ORF   to 3' UTS
            ORF             (5' proximal)   (3' proximal)

Class I     1               2               3               4
Class II    11              13              15              17
Class III   19              21              23              25
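
For bookkeeping, the class and target region of each construct in Table 5 can be held in a small lookup structure. This is only a sketch of one possible representation; the names `TABLE_5`, `REGIONS` and `describe` are hypothetical.

```python
# Construct numbers by class and target region, transcribed from Table 5.
REGIONS = (
    "5' end of ORF",
    "middle ORF (5' proximal)",
    "middle ORF (3' proximal)",
    "3' UTS",
)

TABLE_5 = {
    "Class I": (1, 2, 3, 4),
    "Class II": (11, 13, 15, 17),
    "Class III": (19, 21, 23, 25),
}


def describe(construct: int) -> tuple[str, str]:
    """Return (class, target region) for a construct number listed in Table 5."""
    for cls, numbers in TABLE_5.items():
        if construct in numbers:
            return cls, REGIONS[numbers.index(construct)]
    raise KeyError(f"construct {construct} is not listed in Table 5")


if __name__ == "__main__":
    for number in (1, 13, 25):
        print(number, describe(number))
```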



CD1 mice (6-8 weeks old, ~35 g) were administered one of the test iRNAs in PBS
solution. Two hundred micrograms of iRNA in a volume of solution equal to 10% body
weight (~5.7 mg iRNA/kg mouse) was administered by the method of high pressure tail vein
injection, over a 10-20 sec. time interval. After a 24 h recovery period, a second injection
was performed using the same dose and mode of administration as the first injection, and
following another 24 h, a third and final injection was administered, also using the same dose
and mode of administration. After a final 24 h recovery, the mouse was sacrificed, serum was
collected and the liver and kidney harvested to assay for an effect on ApoM gene expression.
Expression was monitored by quantitative RT-PCR and Western blot analyses. This
experiment was repeated for each of the iRNAs listed in Table 4.
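
The dose figures in this protocol follow from simple unit arithmetic, sketched below. The function names are hypothetical, and the injection volume assumes a solution density of about 1 g/mL, which the text does not state.

```python
def dose_mg_per_kg(dose_ug: float, body_mass_g: float) -> float:
    """Convert an absolute dose to mg/kg; ug per g is numerically equal to mg per kg."""
    return dose_ug / body_mass_g


def injection_volume_ml(body_mass_g: float, fraction: float = 0.10,
                        density_g_per_ml: float = 1.0) -> float:
    """Volume equal to a fraction of body weight (density of ~1 g/mL is an assumption)."""
    return body_mass_g * fraction / density_g_per_ml


if __name__ == "__main__":
    mass_g = 35.0   # approximate mouse mass given in the protocol
    dose_ug = 200.0  # iRNA per injection
    print(f"{dose_mg_per_kg(dose_ug, mass_g):.1f} mg/kg per injection")
    print(f"{injection_volume_ml(mass_g):.1f} mL per injection")
    print(f"{3 * dose_ug:.0f} ug total over three injections")
```

For a 35 g mouse this reproduces the ~5.7 mg/kg figure quoted above.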

Class I iRNAs did not alter ApoM RNA levels in mice, as indicated by quantitative 
RT-PCR. This is in contrast to the effect of these iRNAs in cultured HepG2 cells. Cells 
cotransfected with a plasmid expressing exogenous ApoM RNA under a CMV promoter and 
a class I iRNA demonstrated a 25% or greater reduction in ApoM RNA concentrations as 
compared to control transfections. The iRNA molecules 1, 2 and 3 each caused a 75% 
decrease in exogenous ApoM mRNA levels. 

Class II iRNAs reduced liver and kidney ApoM mRNA levels by ~30-85%. The iRNA
molecule "13" elicited the most dramatic reduction in mRNA levels; quantitative RT-PCR
indicated a decrease of about 85% in liver tissue. Serum ApoM protein levels were also
reduced, as was evidenced by Western blot analysis. The iRNAs 11, 13 and 15 reduced
protein levels by about 50%, while iRNA 17 had the mildest effect, reducing levels only by
~15-20%.

Class III iRNAs (constructs 19, 21, and 23) reduced serum ApoM levels by ~40-50%.

To determine the effect of dosage on iRNA mediated ApoM inhibition, the
experiment described above was repeated with three injections of 50 µg iRNA "11"
(~1.4 mg iRNA/kg mouse). This lower dosage of iRNA resulted in a reduction of serum
ApoM levels of about 50%. This is compared with the reduction seen with the 200 µg
injections, which reduced serum levels by 25-45%. These results indicated that the lower
dosage amounts of iRNAs were effective.

In an effort to increase iRNA uptake by cells, iRNAs were precomplexed with 
lipofectamine prior to tail vein injections. ApoM protein levels were about 50% of wildtype 
levels in mice injected with iRNA "11" when the molecules were preincubated with 
lipofectamine; ApoM levels were also about 50% of wildtype when mice were injected with 
iRNA "11" that was not precomplexed with lipofectamine.

These experiments revealed that modified iRNAs can greatly influence RNAi-
mediated gene silencing. As demonstrated herein, modifications including phosphorothioate 
nucleotides are particularly effective at decreasing target protein levels. 

Example 2: apoB protein as a therapeutic target for lipid-based diseases

Apolipoprotein B (apoB) is a candidate target gene for the development of novel
therapies for lipid-based diseases.

Methods described herein can be used to evaluate the efficacy of a particular siRNA
as a therapeutic tool for treating lipid metabolism disorders resulting from elevated apoB
levels. Use of siRNA duplexes to selectively bind and inactivate the target apoB mRNA is an
approach to treat these disorders.

Two approaches:

i) Inhibition of apoB in ex vivo models by transfecting siRNA duplexes homologous
to human apoB mRNA into a human hepatoma cell line (Hep G2) and monitoring the level of
the protein and the RNA using Western blotting and RT-PCR methods, respectively. siRNA
molecules that efficiently inhibit apoB expression will be tested for similar effects in vivo.

ii) In vivo trials using an apoB transgenic mouse model (apoB100 Transgenic Mice,
C57BL/6NTac-TgN (APOB100), Order Model #'s: 1004-T (hemizygotes), B6 (control)).
siRNA duplexes are designed to target apoB-100 or CETP/apoB double transgenic mice 
which express both cholesteryl ester transfer protein (CETP) and apoB. The effect of the 
siRNA on gene expression in vivo can be measured by monitoring the HDL/LDL cholesterol 
level in serum. The results of these experiments would indicate the therapeutic potential of 
siRNAs to treat lipid-based diseases, including hypercholesterolemia, HDL/LDL cholesterol 
imbalance, familial combined hyperlipidemia, and acquired hyperlipidemia. 

Background

Fats, in the form of triglycerides, are ideal for energy storage because they are
highly reduced and anhydrous. An adipocyte (or fat cell) consists of a nucleus, a cell 
membrane, and triglycerides, and its function is to store triglycerides. 

The lipid portion of the human diet consists largely of triglycerides and cholesterol 
(and its esters). These must be emulsified and digested to be absorbed. Specifically, fats 
(triacylglycerols) are ingested. Bile (bile acids, salts, and cholesterol), which is made in the
liver, is secreted by the gall bladder. Pancreatic lipase digests the triglycerides to fatty acids,
and also digests di- and mono-acylglycerols, which are absorbed by intestinal epithelial cells
and then are resynthesized into triacylglycerols once inside the cells. These triglycerides and
some cholesterols are combined with apolipoproteins to produce chylomicrons.
Chylomicrons consist of approximately 95% triglycerides. The chylomicrons transport fatty
acids to peripheral tissues. Any excess fat is stored in adipose tissue. 

Lipid transport and clearance from the blood into cells, and from the cells into the 
blood and the liver, is mediated by the lipoprotein transport proteins. This class of
approximately 17 proteins can be divided into three groups: apolipoproteins, lipoprotein
processing proteins, and lipoprotein receptors. 

Apolipoproteins coat lipoprotein particles, and include the A-I, A-II, A-IV, B, CI, 
CII, CIII, D, E, Apo(a) proteins. Lipoprotein processing proteins include lipoprotein lipase, 
hepatic lipase, lecithin cholesterol acyltransferase and cholesteryl ester transfer protein.
Lipoprotein receptors include the low density lipoprotein (LDL) receptor, chylomicron-
remnant receptor (the LDL receptor like protein or LDL receptor related protein - LRP) and
the scavenger receptor. 

Lipoprotein Metabolism

Since the triglycerides, cholesterol esters, and cholesterol absorbed
into the small intestine are not soluble in aqueous medium, they must be combined with
suitable proteins (apolipoproteins) in order to prevent them from forming large oil droplets.
The resulting lipoproteins undergo a type of metabolism as they pass through the
bloodstream and certain organs (notably the liver).

Also synthesized in the liver is high density lipoprotein (HDL), which contains the
apoproteins A-1, A-2, C-1, and D; HDL collects cholesterol from peripheral tissues and
blood vessels and returns it to the liver. LDL is taken up by specific cell surface receptors
into an endosome, which fuses with a lysosome where cholesterol ester is converted to free
cholesterol. The apoproteins (including apo B-100) are digested to amino acids. The
receptor protein is recycled to the cell membrane.

The free cholesterol formed by this process has two fates. First, it can move to the
endoplasmic reticulum (ER), where it can inhibit HMG-CoA reductase, the synthesis of
HMG-CoA reductase, and the synthesis of cell surface receptors for LDL. Also in the ER,
cholesterol can speed up the degradation of HMG-CoA reductase. Second, the free cholesterol
can be converted by acyl-CoA:cholesterol acyltransferase (ACAT) to cholesterol esters, which
form oil droplets.

ApoB is the major apolipoprotein of chylomicrons, very low density lipoproteins
(VLDL, which carry most of the plasma triglyceride) and low density lipoprotein (LDL,
which carry most of the plasma cholesterol). ApoB exists in human plasma in two isoforms,
apoB-48 and apoB-100.

ApoB-100 is the major physiological ligand for the LDL receptor. The ApoB
precursor has 4563 amino acids, and the mature apoB-100 has 4536 amino acid residues. The
LDL-binding domain of ApoB-100 is proposed to be located between residues 3129 and
3532. ApoB-100 is synthesized in the liver and is required for the assembly of very low
density lipoproteins (VLDL) and for the preparation of apoB-100 to transport triglycerides
(TG) and cholesterol from the liver to other tissues. ApoB-100 does not interchange between
lipoprotein particles, as do the other lipoproteins, and it is found in IDL and LDL particles.
After the removal of apolipoproteins A, E and C, apoB is incorporated into VLDL by
hepatocytes. ApoB-48 is present in chylomicrons and plays an essential role in the intestinal
absorption of dietary fats. ApoB-48 is synthesized in the small intestine. It comprises the N-
terminal 48% of apoB-100 and is produced by a posttranscriptional apoB-100 mRNA editing
event at codon 2153 (C to U). This editing event is a product of the apoBEC-1 enzyme,
which is expressed in the intestine. This editing event creates a stop codon instead of a
glutamine codon, and therefore apoB-48, instead of apoB-100, is expressed in the intestine
(apoB-100 is expressed in the liver).
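
The effect of the C-to-U edit can be illustrated with the standard genetic code, in which glutamine is encoded by CAA or CAG and the stop codons are UAA, UAG and UGA. The sketch below is illustrative only: the function name is hypothetical, and the length estimate assumes that codon 2153 is numbered on the same coordinates as the 4536-residue mature protein, which the text does not state.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}   # standard genetic code
GLUTAMINE_CODONS = ("CAA", "CAG")     # standard genetic code


def edit_c_to_u(codon: str, position: int = 0) -> str:
    """Replace the C at the given codon position with U (position 0 by default)."""
    if codon[position] != "C":
        raise ValueError(f"no C at position {position} of codon {codon}")
    return codon[:position] + "U" + codon[position + 1:]


if __name__ == "__main__":
    for codon in GLUTAMINE_CODONS:
        edited = edit_c_to_u(codon)
        kind = "stop" if edited in STOP_CODONS else "sense"
        print(f"{codon} -> {edited} ({kind})")

    # Assumption: a stop at codon 2153 leaves 2152 translated residues,
    # counted on the same numbering as the 4536-residue mature apoB-100.
    residues_before_stop = 2153 - 1
    print(f"{residues_before_stop / 4536:.1%} of mature apoB-100")
```

Under that assumption the truncated product covers a little under half of apoB-100, in line with the "N-terminal 48%" description.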

There is also strong evidence that plasma apoB levels may be a better index of the
risk of coronary artery disease (CAD) than total or LDL cholesterol levels. Clinical studies
have demonstrated the value of measuring apoB in hypertriglyceridemic,
hypercholesterolemic and normolipidemic subjects.

Table 6. Reference range of lipid levels in the blood

Lipid                                    Range (mmol/L)

Plasma cholesterol                       3.5-6.5
Low density lipoprotein                  1.55-4.4
Very low density lipoprotein             0.128-0.645
High density lipoprotein/triglycerides   0.5-2.1
Total lipid                              4.0-10 g/L
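
A measured value can be compared against the ranges in Table 6 with a simple lookup. The sketch below is illustrative only; the dictionary and function names are hypothetical, and the total lipid row is omitted because it is given in g/L rather than mmol/L.

```python
# Reference ranges in mmol/L, transcribed from Table 6 (total lipid omitted: g/L).
REFERENCE_RANGES_MMOL_PER_L = {
    "plasma cholesterol": (3.5, 6.5),
    "low density lipoprotein": (1.55, 4.4),
    "very low density lipoprotein": (0.128, 0.645),
    "high density lipoprotein/triglycerides": (0.5, 2.1),
}


def classify(lipid: str, value_mmol_per_l: float) -> str:
    """Report whether a value lies below, within, or above the Table 6 range."""
    low, high = REFERENCE_RANGES_MMOL_PER_L[lipid]
    if value_mmol_per_l < low:
        return "below range"
    if value_mmol_per_l > high:
        return "above range"
    return "within range"


if __name__ == "__main__":
    print(classify("plasma cholesterol", 5.0))
    print(classify("low density lipoprotein", 5.0))
```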



Molecular genetics of lipid metabolism in both humans and induced mutant mouse models

Elevated plasma levels of LDL and apoB are associated with a higher risk for atherosclerosis
and coronary heart disease, a leading cause of mortality. ApoB is the mandatory constituent
of LDL particles. In addition to its role in lipoprotein metabolism, apoB has also been
implicated as a factor in male infertility and fetal development. Furthermore, two
quantitative trait loci regulating plasma apoB levels have been discovered, through the use of
transgenic mouse models. Future experiments will facilitate the identification of human
orthologous genes encoding regulators of plasma apoB levels. These loci are candidate
therapeutic targets for human disorders characterized by altered plasma apoB levels. Such
disorders include non-apoB linked hypobetalipoproteinemia and familial combined
hyperlipidemia. The identification of these genetic loci would also reveal possible new
pathways involved in the regulation of apoB secretion, potentially providing novel sites for
pharmacological therapy.

Diseases and Clinical Pharmacology

Familial combined hyperlipidemia (FCHL) affects an estimated one in 10 Americans.
FCHL can cause premature heart disease.

Familial Hypercholesterolemia (high level of apo B)

Familial hypercholesterolemia is a common genetic disorder of lipid metabolism. It is
characterized by elevated serum TC in association with xanthelasma, tendon and tuberous
xanthomas, accelerated atherosclerosis, and early death from myocardial infarction (MI). It
is caused by absent or defective LDL cell receptors, resulting in delayed LDL clearance, an
increase in plasma LDL levels, and an accumulation of LDL cholesterol in macrophages over
joints and pressure points, and in blood vessels.

Atherosclerosis (high level of apo B)

Atherosclerosis develops as a deposition of cholesterol and fat in the arterial wall due to
disturbances in lipid transport and clearance from the blood into cells and from the cells to
blood and the liver.

Clinical studies have demonstrated that elevations of total cholesterol (TC), low-
density lipoprotein cholesterol (LDL-C) and apoB-100 promote human atherosclerosis.
Similarly, decreased levels of high-density lipoprotein cholesterol (HDL-C) are associated
with the development of atherosclerosis.

ApoB may be a factor in the genetic cause of high cholesterol.

The risk of coronary artery disease (CAD) (high level of apo B)

Cardiovascular disease, including coronary heart disease and stroke, is a leading cause of
death and disability. The major risk factors include age, gender, elevated low-density
lipoprotein cholesterol blood levels, decreased high-density lipoprotein cholesterol levels,
cigarette smoking, hypertension, and diabetes. Emerging risk factors include elevated
lipoprotein (a), remnant lipoproteins, and C reactive protein. Dietary intake, physical
activity and genetics also impact cardiovascular risk. Hypertension and age are the major
risk factors for stroke.

Abetalipoproteinemia, an inherited human disease characterized by a near-complete 
absence of apoB-containing lipoproteins in the plasma, is caused by mutations in the gene for 
microsomal triglyceride transfer protein (MTP). 


Model for human atherosclerosis (Lipoprotein A transgenic mouse)

Numerous studies have demonstrated that an elevated plasma level of lipoprotein(a) (Lp(a))
is a major independent risk factor for coronary heart disease (CHD). Current therapies,
however, have little or no effect on apo(a) levels, and the homology between apo(a) and
plasminogen presents barriers to drug development. Lp(a) particles consist of apo(a) and
apoB-100 proteins, and they are found only in primates and the hedgehog. The development
of an LPA transgenic mouse requires the creation of animals that express both human apoB
and apo(a) transgenes to achieve assembly of Lp(a). An atherosclerosis mouse model would
facilitate the study of the disease process and factors influencing it, and further would
facilitate the development of therapeutic or preventive agents. There are several strategies
for gene-oriented therapy. For example, the missing or non-functional gene can be replaced,
or unwanted gene activity can be inhibited.

Model for lipid metabolism and atherosclerosis. DNX Transgenic Sciences has
demonstrated that both CETP/ApoB and ApoB transgenic mice develop atherosclerotic
plaques.

Model for apoB-100 overexpression. The apoB-100 transgenic mice express high levels of
human apoB-100. They consequently demonstrate elevated serum levels of LDL cholesterol.
After 6 months on a high-fat diet, the mice develop significant foam cell accumulation under
the endothelium and within the media, as well as cholesterol crystals and fibrotic lesions.

Model for cholesteryl ester transfer protein overexpression. The CETP transgenic mice
express the human enzyme, CETP, and consequently demonstrate a dramatically reduced
level of serum HDL cholesterol.

Model for apoB-100 and CETP overexpression. The double transgenic mice express both
CETP and apoB-100, resulting in mice with a human-like serum HDL/LDL distribution.
Following 6 months on a high-fat diet, these mice develop significant foam cell accumulation
underlying the endothelium and within the media, as well as cholesterol crystals and fibrotic
lesions.

ApoB100 Transgenic Mice (Order Model #s: 1004-T (hemizygotes), B6 (control)).
These mice express high levels of human apoB-100, resulting in mice with elevated serum
levels of LDL cholesterol. These mice are useful in identifying and evaluating compounds to
reduce elevated levels of LDL cholesterol and the risk of atherosclerosis. When fed a high
fat, high cholesterol diet, these mice develop significant foam cell accumulation underlying the
endothelium and within the media, and have significantly more complex atherosclerotic
lesions than control animals.

Double Transgenic Mice, CETP/ApoB100 (Order Model #: 1007-TT). These mice express
both CETP and apoB-100, resulting in a human-like serum HDL/LDL distribution. These
mice are useful for evaluating compounds to treat hypercholesterolemia or HDL/LDL
cholesterol imbalance to reduce the risk of developing atherosclerosis. When fed a high
fat, high cholesterol diet, these mice develop significant foam cell accumulation underlying the
endothelium and within the media, and have significantly more complex atherosclerotic
lesions than control animals.

ApoE gene knockout mouse. Homozygous apoE knockout mice exhibit strong
hypercholesterolemia, primarily due to elevated levels of VLDL and IDL caused by a defect
in lipoprotein clearance from plasma. These mice develop atherosclerotic lesions which
progress with age and resemble human lesions (Zhang et al, Science 258:468-471, 1992;
Plump et al, Cell 71:343-353, 1992; Nakashima et al, Arterioscler Thromb. 14:133-140,
1994; Reddick et al, Arterioscler Thromb. 14:141-147, 1994). These mice are a promising
model for studying the effect of diet and drugs on atherosclerosis.

The low density lipoprotein receptor (LDLR) mediates lipoprotein clearance from plasma
through the recognition of apoB and apoE on the surface of lipoprotein particles. Humans
who lack LDL receptors, or have a decreased number of them, have familial
hypercholesterolemia and develop CHD at an early age.

ApoE Knockout Mice (Order Model #: APOE-M). The apoE knockout mouse was created by
gene targeting in embryonic stem cells to disrupt the apoE gene. ApoE, a glycoprotein, is a
structural component of very low density lipoprotein (VLDL) synthesized by the liver and
of intestinally synthesized chylomicrons. It is also a constituent of a subclass of high density
lipoproteins (HDLs) involved in cholesterol transport activity among cells. One of the most
important roles of apoE is to mediate high affinity binding of chylomicrons and VLDL
particles that contain apoE to the low density lipoprotein (LDL) receptor. This allows for the
specific uptake of these particles by the liver, which is necessary for transport and prevents
the accumulation in plasma of cholesterol-rich remnants. The homozygous inactivation of the
apoE gene results in animals that are devoid of apoE in their sera. The mice appear to
develop normally, but they exhibit five times the normal plasma cholesterol level and
spontaneous atherosclerotic lesions. This is similar to a disease in people who have a variant
form of the apoE gene that is defective in binding to the LDL receptor and are at risk for
early development of atherosclerosis and increased plasma triglyceride and cholesterol
levels. There are indications that apoE is also involved in immune system regulation, nerve
regeneration and muscle differentiation. The apoE knockout mice can be used to study the
role of apoE in lipid metabolism, atherogenesis, and nerve injury, and to investigate
intervention therapies that modify the atherogenic process.

Apoe4 Targeted Replacement Mouse (Order Model #: 001549-M) ApoE is a plasma protein 
involved in cholesterol transport, and the three human isoforms (E2, E3, and E4) have been
associated with atherosclerosis and Alzheimer's disease. Gene targeting of 129 ES cells was 
used to replace the coding sequence of mouse apoE with human APOE4 without disturbing 
the murine regulatory sequences. The E4 isoform occurs in approximately 14% of the 
human population and is associated with increased plasma cholesterol and a greater risk of 
coronary artery disease. The Taconic apoE4 Targeted Replacement model has normal 
plasma cholesterol and triglyceride levels, but altered quantities of different plasma 
lipoprotein particles. This model also has delayed plasma clearance of cholesterol-rich 
lipoprotein particles (VLDL), with only half the clearance rate seen in the apoE3 Targeted 
Replacement model. Like the apoE3 model, the apoE4 mice develop altered plasma 
lipoprotein values and atherosclerotic plaques on an atherogenic diet. However, the 
atherosclerosis is more severe in the apoE4 model, with larger plaques and cholesterol, apoE,
and apoB-48 levels twice those seen in the apoE3 model. The Taconic apoE4 Targeted
Replacement model, along with the apoE2 and apoE3 Targeted Replacement Mice, provides
an excellent tool for in vivo study of the human apoE isoforms.

CETP Transgenic Mice (Order Model #: 1003-T) These animals express the human plasma 
enzyme, CETP, resulting in mice with a dramatic reduction in serum HDL cholesterol. The 
mice can be useful in identifying and evaluating compounds that increase the levels of HDL 
cholesterol for reducing the risk of developing atherosclerosis.

Transgene/Promoter: human apolipoprotein A-I. These mice produce mouse HDL
cholesterol particles that contain human apolipoprotein A-I. Transgenic expression is
life-long in both sexes (Biochemical Genetics and Metabolism Laboratory, Rockefeller
University, NY City).

A Mouse Model for Abetalipoproteinemia. Abetalipoproteinemia, an inherited human disease
characterized by a near-complete absence of apoB-containing lipoproteins in the plasma, is
caused by mutations in the gene for microsomal triglyceride transfer protein (MTP). Gene
targeting was used to knock out the mouse MTP gene (Mttp). In heterozygous knockout
mice (Mttp+/-), the MTP mRNA, protein, and activity levels were reduced by 50% in both
liver and intestine. Recent studies with heterozygous MTP knockout mice have suggested
that half-normal levels of MTP in the liver reduce apoB secretion. The investigators
hypothesized that reduced apoB secretion in the setting of half-normal MTP levels might be
caused by a reduced MTP:apoB ratio in the endoplasmic reticulum, which would reduce the
number of apoB-MTP interactions. If this hypothesis were true, half-normal levels of MTP
might have little impact on lipoprotein secretion in the setting of half-normal levels of apoB
synthesis (since the ratio of MTP to apoB would not be abnormally low) and might cause an
exaggerated reduction in lipoprotein secretion in the setting of apoB overexpression (since
the ratio of MTP to apoB would be even lower). To test this hypothesis, they examined the
effects of heterozygous MTP deficiency on apoB metabolism in the setting of normal levels
of apoB synthesis, half-normal levels of apoB synthesis (heterozygous Apob deficiency), and
increased levels of apoB synthesis (transgenic overexpression of human apoB). Contrary to
their expectations, half-normal levels of MTP reduced plasma apoB-100 levels to the same
extent (~25-35%) at each level of apoB synthesis. In addition, apoB secretion from primary
hepatocytes was reduced to a comparable extent at each level of apoB synthesis. Thus, these
results indicate that the concentration of MTP within the endoplasmic reticulum, rather than
the MTP:apoB ratio, is the critical determinant of lipoprotein secretion. Finally,
heterozygosity for an apoB knockout mutation was found to lower plasma apoB-100 levels
more than heterozygosity for an MTP knockout allele. Consistent with that result, hepatic
triglyceride accumulation was greater in heterozygous apoB knockout mice than in
heterozygous MTP knockout mice. Cre/loxP tissue-specific recombination techniques were
also used to generate liver-specific Mttp knockout mice. Inactivation of the Mttp gene in the
liver caused a striking reduction in very low density lipoprotein (VLDL) triglycerides and
large reductions in both VLDL/low density lipoprotein (LDL) and high density lipoprotein
cholesterol levels. Histologic studies in liver-specific knockout mice revealed moderate
hepatic steatosis. Currently being tested is the hypothesis that accumulation of triglycerides
in the liver renders the liver more susceptible to injury by a second insult (e.g.,
lipopolysaccharide).
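
The ratio argument in the paragraph above can be made concrete with a little arithmetic. The following Python sketch is illustrative only: levels are expressed relative to wild type (1.0), the 2-fold figure for apoB overexpression is an assumed placeholder rather than a value from the studies, and the function name is hypothetical. It shows what the MTP:apoB ratio hypothesis would have predicted for the three settings examined.

```python
# Illustrative arithmetic for the MTP:apoB ratio hypothesis.
# All levels are relative to wild type (wild type = 1.0).

def mtp_apob_ratio(mtp_level: float, apob_level: float) -> float:
    """Return the MTP:apoB ratio for the given relative levels."""
    if apob_level <= 0:
        raise ValueError("apoB level must be positive")
    return mtp_level / apob_level

HALF_NORMAL_MTP = 0.5  # heterozygous Mttp knockout (about 50% of normal)

# Relative apoB synthesis in the three settings examined.
# The 2.0 used for overexpression is an assumption for illustration.
APOB_SYNTHESIS = {
    "normal apoB": 1.0,
    "half-normal apoB (heterozygous Apob deficiency)": 0.5,
    "apoB overexpression (assumed 2x)": 2.0,
}

ratios = {
    setting: mtp_apob_ratio(HALF_NORMAL_MTP, apob)
    for setting, apob in APOB_SYNTHESIS.items()
}

for setting, ratio in ratios.items():
    print(f"{setting}: MTP:apoB ratio = {ratio:.2f}")
```

Under the ratio hypothesis the three settings would differ markedly (the ratio is restored with half-normal apoB and lowest with overexpression), whereas the reported reduction in plasma apoB-100 was similar at every level of apoB synthesis, which is why the concentration of MTP rather than the ratio was taken to be the critical determinant.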

Human apo B (apolipoprotein B) Transgene mice show apo B locus may have a causative
role in male infertility. The fertility of apoB (apolipoprotein B) (+/-) mice was recorded during
the course of backcrossing (to C57BL/6J mice) and test mating. No apparent fertility
problem was observed in female apoB (+/-) and wild-type female mice, as was documented
by the presence of vaginal plugs in female mice. Although apoB (+/-) mice mated normally,
only 40% of the animals from the second backcross generation produced any offspring
within the 4-month test period. Of the animals that produced progeny, litters resulted from
< 50% of documented matings. In contrast, all wild-type mice (6/6, i.e., 100%) tested were
fertile. These data suggest genetic influence on the infertility phenotype, as a small number
of male heterozygotes were not sterile. Fertilization in vivo was dramatically impaired in
male apoB (+/-) mice. 74% of eggs examined were fertilized by the sperm from wild-type
mice, whereas only 3% of eggs examined were fertilized by the sperm from apoB (+/-) mice.
The sperm counts of apoB (+/-) mice were mildly but significantly reduced compared with
controls. However, the percentage of motile sperm was markedly reduced in the apoB (+/-)
animals compared with that of the wild-type controls. Of the sperm from apoB (+/-) mice,
20% (i.e., 4.9% of the initial 20% motile sperm) remained motile after 6 hr of incubation,
whereas 45% (i.e., 33.6% of the initial 69.5%) of the motile sperm retained motility in
controls after this time. In vitro fertilization yielded no fertilized eggs in three attempts with
apoB (+/-) mice, while wild-type controls showed a fertilization rate of 53%. However,
sperm from apoB (+/-) mice fertilized 84% of eggs once the zona pellucida had been
removed. Numerous sperm from apoB (+/-) mice were seen binding to zona-intact eggs.
However, these sperm lost their motility when observed 4-6 hours after binding, showing that
sperm from apoB (+/-) mice were unable to penetrate the zona pellucida but that the
interaction between sperm and egg was probably not direct. Sperm binding to zona-free
oocytes was abnormal. In the apoB (+/-) mice, sperm binding did not attenuate, even after
pronuclei had clearly formed, suggesting that apoB deficiency results in abnormal surface
interaction between the sperm and egg.

Knockout of the mouse apoB gene resulted in embryonic lethality in homozygotes,
protection against diet-induced hypercholesterolemia in heterozygotes, and developmental
abnormalities in mice.

Model of insulin resistance, dyslipidemia and overexpression of human apoB. It was shown
that the livers of apoB mice assemble and secrete increased numbers of VLDL particles.

Example 3. Treatment of Diabetes Type-2 with iRNA

Introduction. The regulation of hepatic gluconeogenesis is an important process in the
adjustment of the blood glucose level. Pathological changes in the glucose production of the
liver are a central characteristic of type 2 diabetes. For example, the fasting hyperglycemia
observed in patients with type 2 diabetes reflects the lack of inhibition of hepatic
gluconeogenesis and glycogenolysis due to the underlying insulin resistance in this disease.
Extreme conditions of insulin resistance can be observed, for example, in mice with a
liver-specific insulin receptor knockout ('LIRKO'). These mice have an increased expression of
the two rate-limiting gluconeogenic enzymes, phosphoenolpyruvate carboxykinase (PEPCK)
and the glucose-6-phosphatase catalytic subunit (G6Pase). Insulin is known to repress both
PEPCK and G6Pase gene expression at the transcriptional level, but the signal transduction
involved in the regulation of G6Pase and PEPCK gene expression by insulin is only partly
understood. While PEPCK is involved in a very early step of hepatic gluconeogenesis
(synthesis of phosphoenolpyruvate from oxaloacetate), G6Pase catalyzes the terminal step of
both gluconeogenesis and glycogenolysis, the cleavage of glucose-6-phosphate into
phosphate and free glucose, which is then delivered into the blood stream.

The pharmacological intervention in the regulation of expression of PEPCK and
G6Pase can be used for the treatment of the metabolic aberrations associated with diabetes.
Hepatic glucose production can be reduced by an iRNA-based reduction of PEPCK and
G6Pase enzymatic activity in subjects with type 2 diabetes.

Targets for iRNA

Glucose-6-phosphatase (G6Pase) 

G6Pase mRNA is expressed principally in liver and kidney, and in lower amounts in
the small intestine. Membrane-bound G6Pase is associated with the endoplasmic reticulum.
Low activities have been detected in skeletal muscle and in astrocytes as well.

G6Pase catalyzes the terminal step in gluconeogenesis and glycogenolysis. The 
activity of the enzyme is several fold higher in diabetic animals and probably in diabetic 
humans. Starvation and diabetes cause a 2-3-fold increase in G6Pase activity in the liver and 
a 2-4-fold increase in G6Pase mRNA. 

Phosphoenolpyruvate carboxykinase (PEPCK) 

Overexpression of PEPCK in mice results in symptoms of type 2 diabetes mellitus.
PEPCK overexpression results in a metabolic pattern that increases G6Pase mRNA and
results in a selective decrease in insulin receptor substrate (IRS)-2 protein, decreased
phosphatidylinositol 3-kinase activity, and reduced ability of insulin to suppress
gluconeogenic gene expression.



Table 7. Other targets to inhibit hepatic glucose production

Target                                      Comment

FKHR                                        good evidence for antidiabetic phenotype
                                            (Nakae et al, Nat Genetics 32:245 (2002))
Glucagon
Glucagon receptor
Glycogen phosphorylase
PGC-1 (PPAR-Gamma Coactivator)              regulates the cAMP response (and probably the
                                            PKB/FKHR-regulation) on PEPCK/G6Pase
Fructose-1,6-bisphosphatase
Glucose-6-phosphate translocator
Glucokinase inhibitory regulatory protein


Materials and Methods 

Animals: BKS.Cg-m +/+ Lepr db mice, which contain a point mutation in the leptin receptor
gene, are used to examine the efficacy of iRNA for the targets listed above.



BKS.Cg-m +/+ Lepr db mice are available from the Jackson Laboratory (Stock Number
000642). These animals are obese at 3-4 weeks after birth, show elevation of plasma insulin
at 10 to 14 days, elevation of blood sugar at 4 to 8 weeks, and uncontrolled rise in blood
sugar. Exogenous insulin fails to control blood glucose levels and gluconeogenic activity
increases.

The following numbers of male animals (age > 12 weeks) would ideally be tested with
the following iRNAs:

PEPCK, 2 sequences, 5 animals per sequence
G6Pase, 2 sequences, 5 animals per sequence
1 nonspecific sequence, 5 animals
1 control group (only injected, no siRNA), 5 animals
1 control group (not injected, no siRNA), 5 animals
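
As a quick planning aid, the group list above can be tallied in a few lines of Python. This is only a sketch: the group labels and the helper name are illustrative, and the tally covers just the groups enumerated in the list (it does not include any additional control group described elsewhere in the protocol).

```python
# Treatment groups as enumerated above: (description, animals in group).
GROUPS = [
    ("PEPCK iRNA, sequence 1", 5),
    ("PEPCK iRNA, sequence 2", 5),
    ("G6Pase iRNA, sequence 1", 5),
    ("G6Pase iRNA, sequence 2", 5),
    ("nonspecific sequence", 5),
    ("control: injected, no siRNA", 5),
    ("control: not injected, no siRNA", 5),
]

def total_animals(groups):
    """Sum the animals over all listed groups."""
    return sum(count for _, count in groups)

print(f"{len(GROUPS)} groups, {total_animals(GROUPS)} animals in total")
```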

Reagents: Necessary reagents would ideally include a Glucometer Elite XL (Bayer,
Pittsburgh, PA) for glucose quantification, and an Insulin Radioimmunoassay (RIA) kit
(Amersham, Piscataway, NJ) for insulin quantitation.

Assays: 

G6Pase enzyme assays and PEPCK enzyme assays are used to measure the activity of the
enzymes. Northern blotting is used to detect levels of G6Pase and PEPCK mRNA.
Antibody-based techniques (e.g., immunoblotting, immunofluorescence) are used to detect
levels of G6Pase and PEPCK protein. Glycogen staining is used to detect levels of glycogen
in the liver. Histological analysis is performed to analyze tissues.

Gene information:

G6Pase
GenBank® No.: NM_008061, Mus musculus glucose-6-phosphatase, catalytic
(G6pc), mRNA 1..2259, ORF 83..1156
GenBank® No.: U00445, Mus musculus glucose-6-phosphatase mRNA, complete cds
1..2259, ORF 83..1156
GenBank® No.: BC013448

PEPCK
GenBank® No.: NM_011044, Mus musculus phosphoenolpyruvate carboxykinase 1,
cytosolic (Pck1), mRNA 1..2618, ORF 141..2009
GenBank® No.: AF009605.1
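
The ORF coordinates quoted above can be sanity-checked with simple arithmetic. The sketch below assumes the ranges are 1-based and inclusive, as in GenBank feature locations, and that the ORF range includes the stop codon; the function names are hypothetical. It checks that each ORF length is a whole number of codons.

```python
def orf_length_nt(start: int, end: int) -> int:
    """Length in nucleotides for 1-based, inclusive coordinates."""
    if start < 1 or end < start:
        raise ValueError("expected 1-based coordinates with start <= end")
    return end - start + 1

def codon_count(start: int, end: int) -> int:
    """Number of codons in the ORF; raises if the length is not a multiple of 3."""
    length = orf_length_nt(start, end)
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length // 3

# ORF ranges as quoted in the gene information above.
ORFS = {
    "G6pc (NM_008061)": (83, 1156),
    "Pck1 (NM_011044)": (141, 2009),
}

for name, (start, end) in ORFS.items():
    print(f"{name}: {orf_length_nt(start, end)} nt, {codon_count(start, end)} codons")
```

Any siRNA target position chosen within these transcripts can be checked against the same ranges to confirm that it falls inside the coding region.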

Administration of iRNA:

iRNA corresponding to the genes described above would be administered to mice
by hydrodynamic injection. One control group of animals would be treated with
Metformin as a positive control for reduction in hepatic glucose levels.

Experimental Protocol

Mice would be housed in a facility in which there is light from 7:00 AM to 7:00 PM.
Mice would be fed ad libitum from 7:00 PM to 7:00 AM and fasted from 7:00 AM to 7:00 PM.

Day 0: 7:00 PM: Approximately 100 µl blood would be drawn from the tail. Serum would
be isolated to measure glucose, insulin, HbA1c (EDTA-blood), glucagon, FFAs, lactate,
corticosterone, and serum triglycerides.

Day 1-7: Blood glucose would be measured daily at 8:00 AM and 6:00 PM (approx. 3-5 µl,
measured with a Haemoglucometer).

Day 8: Blood glucose would be measured at 8:00 AM and 6:00 PM. iRNA would be
injected between 10:00 AM and 2:00 PM.

Day 9-20: Blood glucose would be measured daily at 8:00 AM and 6:00 PM. 
Day 21: Mice would be sacrificed after 10 hours of fasting. 

Blood would be isolated. Glucose, insulin, HbA1c (EDTA-blood), glucagon, FFAs, lactate,
corticosterone, and serum triglycerides would be measured. Liver tissue would be isolated for
histology, protein assays, RNA assays, glycogen quantitation, and enzyme assays.
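
The day-by-day protocol above can be written down as a small schedule table, which is convenient for checking that no day is missed. The following Python sketch is one possible representation; the function name and the wording of the action strings are illustrative, not part of the protocol.

```python
def build_schedule():
    """Map each study day (0-21) to the list of planned actions."""
    schedule = {0: ["baseline blood draw at 7:00 PM (approx. 100 µl)"]}

    # Days 1-20: blood glucose twice a day.
    for day in range(1, 21):
        schedule[day] = ["blood glucose at 8:00 AM and 6:00 PM"]

    # Day 8 additionally includes the iRNA injection.
    schedule[8].append("iRNA injection between 10:00 AM and 2:00 PM")

    # Day 21: end of study.
    schedule[21] = [
        "sacrifice after 10 hours of fasting",
        "collect blood and liver tissue for assays",
    ]
    return schedule

schedule = build_schedule()
for day in sorted(schedule):
    print(f"Day {day}: " + "; ".join(schedule[day]))
```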

Example 4: Inhibition of Glucose-6-Phosphatase iRNA in vivo

iRNA targeted to the Glucose-6-Phosphatase (G6P) gene was used to examine the 
effects of inhibition of G6P expression on glucose metabolism in vivo. 

Female mice, 10 weeks of age, strain BKS.Cg-m +/+ Lepr db (The Jackson
Laboratory), were used for in vivo analysis of enzymes of hepatic glucose production.
Mice were housed under conditions where it was light from 6:30 am to 6:30 pm. Mice were
fed (ad libitum) during the night period and fasted during the day period.

On day 1, approximately 100 µl of blood was collected from test animals by puncturing the
retroorbital plexus. On days 1-7, blood glucose was measured in blood obtained from tail
veins (approximately 3-5 µl) using a Glucometer (Elite XL, Bayer). Blood glucose was
sampled daily at 8 am and 6 pm.

On day 7 at approximately 2 pm, GL3 plasmid (10 µg) and siRNAs (100 µg G6Pase
specific, Renilla nonspecific or no siRNA control) were delivered to animals using
hydrodynamic coinjection.

On day 8, GL3 expression was analyzed by injection of luciferin (3 mg) after
anaesthesia with avertin and imaging. This was done to control for successful hydrodynamic
delivery.

On days 8-10, blood glucose was measured in blood obtained from tail veins
(approximately 3-5 µl) using a Glucometer (Elite XL, Bayer).

On day 10, mice were sacrificed after 10 hours of fasting. Blood and liver were 
isolated from sacrificed animals. 

Results: Coinjection of GL3 plasmid and G6Pase iRNA (G6P4) reduced blood 
glucose levels for the short term. Coinjection of GL3 plasmid and Renilla nonspecific iRNA 
had no effect on blood glucose levels. 



Example 5: Selected Palindromic Sequences 

Tables 8-13 below provide selected palindromic sequences from the following genes: human
ApoB, human glucose-6-phosphatase, rat glucose-6-phosphatase, β-catenin, and hepatitis C
virus (HCV).
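
One way to see why a source/match pair is of interest is to compare the match sequence with the reverse complement of the source sequence. The Python sketch below does this for the first pair listed in Table 8 (SEQ ID NO: 1 and SEQ ID NO: 1004). It is a minimal illustration under stated assumptions: the function names are hypothetical, and the position count it reports is simply a complementarity measure, not necessarily the quantity tabulated in the "#" or "B" columns, whose definitions are not given in this section.

```python
# Lower-case DNA alphabet, as used in the tables.
COMPLEMENT = str.maketrans("acgt", "tgca")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (a, c, g, t only)."""
    seq = seq.lower()
    if set(seq) - set("acgt"):
        raise ValueError("sequence contains characters other than a, c, g, t")
    return seq.translate(COMPLEMENT)[::-1]

def complementary_positions(source: str, match: str) -> int:
    """Count positions where `match` agrees with the reverse complement of `source`."""
    if len(source) != len(match):
        raise ValueError("sequences must have the same length")
    return sum(a == b for a, b in zip(reverse_complement(source), match.lower()))

source = "ggccattccagaagggaag"  # SEQ ID NO: 1
match = "cttccgttctgtaatggcc"   # SEQ ID NO: 1004

print(reverse_complement(source))
print(complementary_positions(source, match), "of", len(source), "positions agree")
```

For this pair the match sequence differs from the exact reverse complement of the source at only two of nineteen positions, which illustrates how a single strand directed at one site can be largely complementary to a second site in the same transcript.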

Table 8. Selected palindromic sequences from human ApoB

Source                                Start  End    Match                                 Start  End    #  B
                                      Index  Index                                        Index  Index
SEQ ID NO: 1     ggccattccagaagggaag  509    528    SEQ ID NO: 1004  cttccgttctgtaatggcc  5795   5814   1  9
SEQ ID NO: 2     tgccatctcgagagttcca  4099   4118   SEQ ID NO: 1005                       10876  10895  1  8
SEQ ID NO: 3     catgtcaaacactttgtta  7056   7075   SEQ ID NO: 1006  [aacaaattccttgacatg  7358   7377   1  8
SEQ ID NO: 4     tttgttataaatcttattg  7068   7087   SEQ ID NO: 1007  caataagatcaatagcaaa  8990   9009   1  8
SEQ ID NO: 5     tctggaaaagggtcatgga  8880   8899   SEQ ID NO: 1008  tccatgtcccatttacaga  11356  11375  1  8
SEQ ID NO: 6     cagctcttgttcaggtcca  10900  10919  SEQ ID NO: 1009  tggacctgcaccaaagctg  13952  13971  1  8
SEQ ID NO: 7     ggaggttccccagctctgc  356    375    SEQ ID NO: 1010  gcagccctgggaaaactcc  6447   6466   1  7
SEQ ID NO: 8     ctgttttgaagactctcca  1081   1100   SEQ ID NO: 1011  tggagggtagtcataacag  10327  10346  1  7
SEQ ID NO: 9     agtggctgaaacgtgtgca  1297   1316   SEQ ID NO: 1012  tgcagagctttctgccact  13508  13527  1  7
SEQ ID NO: 10    ccaaaatagaagggaatct  2068   2087   SEQ ID NO: 1013  agattcctttgccttttgg  4000   4019   1  7
SEQ ID NO: 11    tgaagagaagattgaattt  3620   3639   SEQ ID NO: 1014  aaaltctcttttcttttca  9212   9231   1  7
SEQ ID NO: 12    agtggtggcaacaccagca  4230   4249   SEQ ID NO: 1015  tgctagtgaggccaacact  10649  10668  1  7
SEQ ID NO: 13    aaggctccacaagtcatca  5950   5969   SEQ ID NO: 1016  tgatgatatctggaacctt  10724  10743  1  7
SEQ ID NO: 14    gtcagccaggtttatagca  7725   7744   SEQ ID NO: 1017  tgctaagaaccttactgac  7781   7800   1  7
SEQ ID NO: 15    tgatatctggaaccttgaa  10727  10746  SEQ ID NO: 1018  ttcactgttcctgaaatca  7863   7882   1  7
SEQ ID NO: 16    gtcaagttgagcaatttct  13423  13442  SEQ ID NO: 1019  agaaaaggcacaccttgac  11072  11091  1  7
SEQ ID NO: 17    atccagatggaaaagggaa  13480  13499  SEQ ID NO: 1020  ttccaatttccctgtggat  3680   3699   1  7
SEQ ID NO: 18    atttgtttgtcaaagaagt  4543   4562   SEQ ID NO: 1021  acttcagagaaatacaaat  11401  11420  4  6
SEQ ID NO: 19    ctggaaaatgtcagcctgg  204    223    SEQ ID NO: 1022  ccagacttccgtttaccag  8235   8254   2  6
SEQ ID NO: 20    accaggaggttcttcttca  1729   1748   SEQ ID NO: 1023  tgaagtgtagtctcctggt  5089   5108   2  6
SEQ ID NO: 21                         1956   1975   SEQ ID NO: 1024  attccatcacaaatccttt  9661   9680   2  6
SEQ ID NO: 22    gctacagcttatggctcca  3570   3589   SEQ ID NO: 1025  tggatctaaatgcagtagc  11623  11642  2  6
SEQ ID NO: 23    atcaatattgatcaatttg  6414   6433   SEQ ID NO: 1026  caaagaagtcaagattgat  4553   4572   2  6
SEQ ID NO: 24    gaattatcttttaaaacat  7326   7345   SEQ ID NO: 1027  atgtgttaacaaaatattc  11494  11513  2  6
SEQ ID NO: 25    cgaggcccgcgctgctggc  130    149    SEQ ID NO: 1028  gccagaagtgagatcctcg  3507   3526   1  6
SEQ ID NO: 26    acaactatgaggctgagag  271    290    SEQ ID NO: 1029  ctctgagcaacaaatttgt  10309  10328  1  6
SEQ ID NO: 27    gctgagagttccagtggag  282    301    SEQ ID NO: 1030  ctccatggcaaatgtcagc  10885  10904  1  6
SEQ ID NO: 28                         448    467    SEQ ID NO: 1031  gagtcattgaggttcttca  4929   4948   1  6
SEQ ID NO: 29    cctacttacatcctgaaca  558    577    SEQ ID NO: 1032  tgttcataagggaggtagg  12766  12785  1  6
SEQ ID NO: 30    ctacttacatcctgaacat  559    578    SEQ ID NO: 1033  atgttcataagggaggtag  12765  12784  1  6
SEQ ID NO: 31    gagacagaagaagecaa    615    634    SEQ ID NO: 1034  gcttggttttgccagtctc  2459   2478   1  6
SEQ ID NO: 32    cactcactttaccgtcaag  671    690    SEQ ID NO: 1035  cttgaacacaaagtcagtg  6000   6019   1  6
SEQ ID NO: 33    ctgatcagcagcagccagt  822    841    SEQ ID NO: 1036  actgggaagtgcttatcag  5237   5256   1  6
SEQ ID NO: 34    actggacgctaagaggaag  854    873    SEQ ID NO: 1037  cttccccaaagagaccagt  2890   2909   1  6
SEQ ID NO: 35    agaggaagcatgtggcaga  865    884    SEQ ID NO: 1038  tctggcatttactttctct  5921   5940   1  6
SEQ ID NO: 36    tgaagactctccaggaact  1087   1106   SEQ ID NO: 1039  agttgaaggagactattca  7216   7235   1  6
SEQ ID NO: 37    ctctgagcaaaatatccag  1121   1140   SEQ ID NO: 1040  ctggttactgagctgagag  1161   1180   1  6
SEQ ID NO: 38    atgaagcagtcacatctct  1189   1208   SEQ ID NO: 1041  agagctgccagtccttcat  10016  10035  1  6
SEQ ID NO: 39    ttgccacagctgattgagg  1209   1228   SEQ ID NO: 1042  cctcctacagtggtggcaa  4222   4241   1  6
SEQ ID NO: 40    agctgattgaggtgtccag  1216   1235   SEQ ID NO: 1043  ctggattccacatgcagct  11847  11866  1  6
SEQ ID NO: 41    tgctccactcacatcctco  1278   1297   SEQ ID NO: 1044  ggaggctttaagttcagca  7601   7620   1  6
SEQ ID NO: 42    tgaaacgtgtgcatgccaa  1303   1322   SEQ ID NO: 1045  ttgggagagacaagtttca  6500   6519   1  6
SEQ ID NO: 43    gacattgctaattacctga  1503   1522   SEQ ID NO: 1046  tcagaagctaagcaatgtc  7232   7251   1  6
SEQ ID NO: 44    ttcttcttcagactttcct  1738   1757   SEQ ID NO: 1047  aggagagtccaaattagaa  8498   8517   1  6
SEQ ID NO: 45    ccaatatcttgaactcaga  1903   1922   SEQ ID NO: 1048  tctgaattcattcaattgg  6485   6504   1  6
SEQ ID NO: 46    aaagttagtgaaagaagtt  1946   1965   SEQ ID NO: 1049  aactaccctcactgccttt  2132   2151   1  6
SEQ ID NO: 47    aagttagtgaaagaagttc  1947   1966   SEQ ID NO: 1050  gaacctctggcatttactt  5916   5935   1  6
SEQ ID NO: 48    aaagaagttctgaaagaat  1956   1975   SEQ ID NO: 1051  attctctggtaactacttt  5482   5501   1  6
SEQ ID NO: 49    tttggctataccaaagatg  2322   2341   SEQ ID NO: 1052  catcttaggcactgacaaa  4997   5016   1  6
SEQ ID NO: 50    tgttgagaagctgattaaa  2381   2400   SEQ ID NO: 1053  tttagccatcggctcaaca  5700   5719   1  6
SEQ ID NO: 51    caggaagggctcaaagaat  2561   2580   SEQ ID NO: 1054  attcctttaacaattcctg  9492   9511   1  6
SEQ ID NO: 52    aggaagggctcaaagaatg  2562   2581   SEQ ID NO: 1055  cattcctttaacaattcct  9491   9510   1  6
SEQ ID NO: 53    gaagggctcaaagaatgac  2564   2583   SEQ ID NO: 1056  gtcagtcttcaggctcttc  7914   7933   1  6
SEQ ID NO: 54    caaagaatgacttttttct  2572   2591   SEQ ID NO: 1057  agaaggatggcattttttg  14000  14019  1  6
SEQ ID NO: 55    catggagaatgcctttgaa  2603   2622   SEQ ID NO: 1058  ttcagagccaaagtccatg  7119   7138   1  6
SEQ ID NO: 56    ggagccaaggctggagtaa  2679   2698   SEQ ID NO: 1059  ttactccaacgccagctcc  3050   3069   1  6
SEQ ID NO: 57    tcattccttccccaaagag  2884   2903   SEQ ID NO: 1060  ctctctggggcatctatga  5139   5158   1  6
SEQ ID NO: 58    acctatgagctccagagag  3165   3184   SEQ ID NO: 1061  ctctcaagaccacagaggt  12976  12995  1  6
SEQ ID NO: 59    gggcaaaacgtcttacaga  3365   3384   SEQ ID NO: 1062  tctgaaagacaacgtgccc  12317  12336  1  6
SEQ ID NO: 60    accctggacattcagaaca  3387   3406   SEQ ID NO: 1063  tgttgctaaggttcagggt  5675   5694   1  6
SEQ ID NO: 61    atgggcgacctaagttgtg  3429   3448   SEQ ID NO: 1064  cacaaattagtttcaccat  8941   8960   1  6
SEQ ID NO: 62    gatgaagagaagattgaat  3618   3637   SEQ ID NO: 1065  attccagcttccccacatc  8330   8349   1  6
SEQ ID NO: 63    caatgtagataccaaaaaa  3656   3675   SEQ ID NO: 1066  ttttttggaaatgccattg  8643   8662   1  6
SEQ ID NO: 64    gtagataccaaaaaaatga  3660   3679   SEQ ID NO: 1067  tcatgtgatgggtctctac  4371   4390   1  6
SEQ ID NO: 65    gcttcagttcatttggact  4509   4528   SEQ ID NO: 1068  agtcaagaaggacttaagc  5304   5323   1  6
SEQ ID NO: 66    tttgtttgtcaaagaagtc  4544   4563   SEQ ID NO: 1069  gacttcagagaaatacaaa  11400  11419  1  6
SEQ ID NO: 67    ttgtttgtcaaagaagtca  4545   4564   SEQ ID NO: 1070  tgacttcagagaaatacaa  11399  11418  1  6
SEQ ID NO: 68    tggcaatgggaaactcgct  5846   5865   SEQ ID NO: 1071  agcgagaatcaccctgcca  8219   8238   1  6
SEQ ID NO: 69    aacctctggcatttacttt  5917   5936   SEQ ID NO: 1072  aaaggagatgtcaagggtt  10599  10618  1  6
SEQ ID NO: 70    catttactttctctcatga  5926   5945   SEQ ID NO: 1073  tcatttgaaagaataaatg  7026   7045   1  6
SEQ ID NO: 71    aaagtcagtgccctgctta  6009   6028   SEQ ID NO: 1074  taagaaccttactgacttt  7784   7803   1  6
SEQ ID NO: 72 


tcccattttttgag a cctt 


6322 


6341 


SEQ ID NO: 1075 


aaggacttcaggaatggga 


12004 12023 1 


6 


SEQ ID NO: 73 


catcaatattg ate a attt 


6413 


6432 


SEQ ID NO: 1076 


aaattaaaaagtcttgatg 


6732 6751 1 


6 


ocn in Mn» ia 








SEQ ID NO: 1077 


taaaccaaaacttggttta 


9019 9038 1 


6 


SEQ ID NO - 75 


tettgatglaateattlaa 


6713 


6732 


SEQ ID NO: 1078 


ttcaaagacttaaaaaata 


8007 8026 1 


6 


opn in Mn* 


atg atct acatttgtttat 


6790 




SEQ ID NO: 1079 


ataaagaaattaaagtcat 


7380 7399 1 


6 


ppn in wn* 77 






6938 


SEQ ID NO: 1080 


atatattgtcagtgcctct 


13382 13401 1 


6 


SEQ ID NO' 78 


gacacatacagaatataga 


6922 


6941 


SEQ ID NO: 1081 


tctaaattcagttcttgtc 


11327 11346 1 


6 


SEQ ID NO: 79 


acjcatcjtcaaacactttgt 


7054 


7073 


SEQ ID NO: 1082 


acaaagtcagtgccctgct 


6007 6026 1 


6 


cert in wa- ro 


tttttagaggaaaccaagg 






SEQ ID NO: 1083 


cctttgtgtacaccaaaaa 


11230 11249 1 


6 


cm m wn- r-i 




7516 


7535 


SEQ ID NO: 1084 




11229 11248 1 


6 


OCLW IU IMW. 0.£ 


ggtJgateTa^crtgJa 




9326 


SEQ ID NO: 1085 


ttcagaaatactgttttcc 


12824 12843 1 


6 


SEQ ID NO: 83 


cactgtttctgagtcccag 


9334 


9353 


SEQ ID NO: 1086 


ctgggacctaccaagagtg 


12523 12542 1 


6 


SEQ ID NO' 84 


cacaaatcctttggctgtg 


9668 


9687 


SEQ ID NO: 1087 


cacatttcaaggaattgtg 


10063 10082 1 


6 


SEQ ID NO: 85








SEQ ID NO: 1088 


ggaactgttgactcaggaa 


12569 12588 1 


6 


SEQ ID NO: 86


gaaateLaagctLtct 


10042 


10061 


SEQ ID NO: 1089 


agagccaggtcgagctttc 


11044 11063 1 


6 


SEQ ID NO: 87


tttcttcatcttcatctgt 


10210 


10229 


SEQ ID NO: 1090 


acagctgaaagagatgaaa 


13055 13074 1 


6 


SEQ ID NO: 88 


tctaccgctaaaggagcag 


10521 


10540 


SEQ ID NO: 1091 


ctgcacgctttgaggtaga 


11761 11780 1 


6 


SEQ ID NO: 89


ctaccgctaaaggagcagt 






SEQ ID NO: 1092 


actgcacgctttgaggtag 


11760 11779 1 


6 


SEQ ID NO: 90


agggcctctttttcaccaa 


10831 


10850 


SEQ ID NO: 1093 


ttggccaggaagtggccct 


10957 10976 1 


6 


SEQ ID NO: 91


c cca ccc g aaaag 


11265 


11284 


SEQ ID NO: 1094 


ctttttcaccaacggagaa 


10838 10857 1 


6 


SEQ ID NO: 92


gaaaaacaaagcagattat 


11816 


11835 


SEQ ID NO: 1095 


ataaactgcaagatttttc


13600 13619 1 


6 


SEQ ID NO: 93


actcactcattgattttct 


12682 


12701 


SEQ ID NO: 1096 


agaaaatcaggatctgagt 


14027 14046 1 


6 


SEQ ID NO: 94








SEQ ID NO: 1097 


gattaccaccagcagttta 


13578 13597 1 


6 


SEQ ID NO: 95


iTaTacgagcttcaggaag 


13200 


13219 


SEQ ID NO: 1098 


cttcgtgaagaatattttg 


13260 13279 1 


6 


SEQ ID NO: 96 


tggaataatgctcagtgtt 


2366 


2385 


SEQ ID NO: 1099 


aacacttacttgaattcca 


10662 10681 3 


5 


SEQ ID NO: 97


gatttgaaatccaaagaag 


2400 


2419 


SEQ ID NO: 1100 


cttcagagaaatacaaatc 


11402 11421 3 


5 


SEQ ID NO: 98




2401 


2420 


SEQ ID NO: 1101 


acttcagagaaatacaaat 


11401 11420 3 


5 


SEQ ID NO: 99 


L!Icagccgcttcmg 9 


990 


1009 


SEQ ID NO: 1102 


caaagaagtcaagattgat 


4553 4572 2 


5 


SEQ ID NO: 100 


tgttttgaagactctccag 


1082 


1101 


SEQ ID NO: 1103 


ctggaaagttaaaacaaca 


6955 6974 2 


5 


SEQ ID NO: 101 


cccttctgatagatgtggt 


1324 


1343 


SEQ ID NO: 1104 


accaaagctggcaccaggg 


13961 13980 2 


5 


SEQ ID NO: 102 


tgagcaagtgaagaacttt 


1868 


1887 


SEQ ID NO: 1105 


aaagccattcagtctctca 


12963 12982 2 


5 


SEQ ID NO: 103 


atttgaaatccaaagaagt 


2401 


2420 


SEQ ID NO: 1106 


acttttctaaacttgaaat 


9055 9074 2 


5 


SEQ ID NO: 104 


atccaaagaagtcccggaa 


2408 


2427 


SEQ ID NO: 1107 


ttccggggaaacctgggat 


12721 12740 2 


5 


SEQ ID NO: 105 


agagcctacctccgcatct


2430 


2449 


SEQ ID NO: 1108 


agatggtacgttagcctct 


11921 11940 2 


5 


SEQ ID NO: 106 


aatgcctttgaactcccca 


2610 


2629 


SEQ ID NO: 1109 


tgggaactacaatttcatt


7012 7031 2 


5 


SEQ ID NO: 107 


gaagtccaaattccggatt 


3297 


3316 


SEQ ID NO: 1110 


aatcttcaatttattcttc 


13815 13834 2 


5 


SEQ ID NO: 108 


igcaagcagaagccagaa 


3 3496 


3515 


SEQ ID NO: 1111 


cttcaggttccatcgtgca 


11376 11395 2 


5 


SEQ ID NO: 109 


gaagagaagattgaatttg 


3621 


3640 


SEQ ID NO: 1112 


caaaacctactgtctcttc 


10459 10478 2 


5 
SEQ ID NO: 110


atgctaaaggcacatatgg 


4597 


4616 


SEQ ID NO: 1113 


ccatatgaaagtcaagcat 


12656 


12675 2 


5 


SEQ ID NO: 111 


tccctcacctccacctctg 


4737 


4756 


SEQ ID NO: 1114 


cagattctcagatgaggga 


8912 


8931 2 


5 


SEQ ID NO: 112 


atttacagctctgacaagt 


5427 


5446 


SEQ ID NO: 1115 


acttttctaaacttgaaat 


9055 


9074 2 


5 


SEQ ID NO: 113 


aggagcctaccaaaataat 


5594 


5613 


SEQ ID NO: 1116 


attatgttgaaacagtcct 


11830 11849 2 


5 


SEQ ID NO: 114 


aaagctgaagcacatcaat 


6401 


6420 


SEQ ID NO: 1117 


attgttgctcatctccttt 


10194 


10213 2 


5 


SEQ ID NO: 115 


ctgctggaaacaacgagaa 


9418 


9437 


SEQ ID NO: 1118 


ttctgattaccaccagcag 


13574 13593 2 


5 


SEQ ID NO: 116 


ttgaaggaattcttgaaaa 


9582 


9601 


SEQ ID NO: 1119 


ttttaaaagaaatcttcaa


13805 


13824 2 


5 


SEQ ID NO: 117 


gaagtaaaagaaaattttg 


10743 10762 


SEQ ID NO: 1120 


caaaacctactgtctcttc 


10459 


10478 2 


5 


SEQ ID NO: 118 


tgaagaagatggcaaattt 


11984 


12003 


SEQ ID NO: 1121 


aaatgtcagctcttgttca 


10894 10913 2 


5 


SEQ ID NO: 119 


aggatctgagttattttgc 


14035 


14054 


SEQ ID NO: 1122 


gcaagtcagcccagttcct 


10920 


10939 2 


5 


SEQ ID NO: 120 


gtgcccttctcggttgctg 


18 


37 


SEQ ID NO: 1123 


cagccattgacatgagcac 


5740 


5759 1 


5 


SEQ ID NO: 121 


ggcgctgcctgcgctgctg 


146 


165 


SEQ ID NO: 1124 


cagctccacagactccgcc 


3062 


3081 1 


5 


SEQ ID NO: 122 


ctgcgctgctgctgctgct 


154 


173 


SEQ ID NO: 1125 


agcagaaggtgcgaagcag 


3224 


3243 1 


5 


SEQ ID NO: 123 


gctgctggcgggcgccagg 


170 


189 


SEQ ID NO: 1126 


cctggattccacatgcagc 


11846 11865 1 


5 


SEQ ID NO: 124 


aagaggaaatgctggaaaa 


193 


212 


SEQ ID NO: 1127 


tttttcttcactacatctt 


2584 


2603 1 


5 


SEQ ID NO: 125 


ctggaaaatgtcagcctgg 


204 


223 


SEQ ID NO: 1128 


ccagacttccacatcccag 


3915 


3934 1 


5 


SEQ ID NO: 126 


tggagtccctgggactgct 


296 


315 


SEQ ID NO: 1129 


agcatgcctagtttctcca 


9945 


9964 1 


5 


SEQ ID NO: 127 


ggagtccctgggactgctg 


297 


316 


SEQ ID NO: 1130 


cagcatgcctagtttctcc 


9944 


9963 1 


5 


SEQ ID NO: 128 


tgggactgctgattcaaga 


305 


324 


SEQ ID NO: 1131 


tcttccatcacttgaccca 


2042 


2061 1 


5 


SEQ ID NO: 129 


ctgctgattcaagaagtgc 


310 


329 


SEQ ID NO: 1132 


gcacaccttgacattgcag 


11079 


11098 1 


5 


SEQ ID NO: 130 


tgccaccaggatcaactgc 


326 


345 


SEQ ID NO: 1133 


gcaggctgaactggtggca 


2717 


2736 1 


5 


SEQ ID NO: 131 


gccaccaggatcaactgca 


327 


346 


SEQ ID NO: 1134 


tgcaggctgaactggtggc 


2716 


2735 1 


5 


SEQ ID NO: 132 


tgcaaggttgagctggagg 


342 


361 


SEQ ID NO: 1135 


cctccacctctgatctgca 


4744 


4763 1 


5 


SEQ ID NO: 133 


caaggttgagctggaggtt 


344 


363 


SEQ ID NO: 1136 


aacccctacatgaagcttg 


13755 13774 1 


5 


SEQ ID NO: 134 


ctctgcagcttcatcctga 


369 


388 


SEQ ID NO: 1137 


tcaggaagcttctcaagag 


13211 


13230 1 


5 


SEQ ID NO: 135 


cagcttcatcctgaagacc 


374 


393 


SEQ ID NO: 1138 


ggtcttgagttaaatgctg 


4977 


4996 1 


5 


SEQ ID NO: 136 


gcttcatcctgaagaccag 


376 


395 


SEQ ID NO: 1139 


ctggacgctaagaggaagc 


855 


874 1 


5 


SEQ ID NO: 137 


tcatcctgaagaccagcca 


379 


398 


SEQ ID NO: 1140 


tggcatggcattatgatga 


3604 


3623 1 


5 


SEQ ID NO: 138 


gaaaaccaagaactctgag 


452 


471 


SEQ ID NO: 1141 


ctcaaccttaatgattttc 


8286 


8305 1 


5 


SEQ ID NO: 139 


agaactctgaggagtttgc 


460 


479 


SEQ ID NO: 1142 


gcaagctatacagtattct 


8377 


8396 1 


5 


SEQ ID NO: 140 


tctgaggagtttgctgcag 


465 


484 


SEQ ID NO: 1143 


ctgcaggggatcccccaga 


2526 


2545 1 


5 


SEQ ID NO: 141 


tttgctgcagccatgtcca 


474 


493 


SEQ ID NO: 1144 


tggaagtgtcagtggcaaa 


10372 


10391 1 


5 


SEQ ID NO: 142 


caagaggggcatcatttct 


578 


597 


SEQ ID NO: 1145 


agaataaatgacgttcttg 


7035 


7054 1 


5 


SEQ ID NO: 143 


tcactttaccgtcaagacg 


674 


693 


SEQ ID NO: 1146 


cgtctacactatcatgtga 


4360 


4379 1 


5 


SEQ ID NO: 144 


tttaccgtcaagacgagga 


678 


697 


SEQ ID NO: 1147 


tccttgacatgttgataaa 


7366 


7385 1 


5 


SEQ ID NO: 145 


cactggacgctaagaggaa 


853 


872 


SEQ ID NO: 1148 


ttccagaaagcagccagtg 


12498 


12517 1 


5 


SEQ ID NO: 146 


aggaagcatgtggcagaac 


867 


886 


SEQ ID NO: 1149 


cttcatacacattaatcct 


9988 


10007 1 


5 


SEQ ID NO: 147 


caaggagcaacacctcttc 


893 


912 


SEQ ID NO: 1150 


gaagtagtactgcatcttg 


6835 


6854 1 


5 
SEQ ID NO: 148


acagactttgaaacttgaa 


959 


978 


SEQ ID NO: 1151 


ttcaattcttcaatgctgt 


10500 


10519 1 


5 


SEQ ID NO: 149 


tgatgaagcagtcacatct 


1187 


1206 


SEQ ID NO: 1152 


agatttgaggattccatca 


7976 


7995 1 


5 


SEQ ID NO: 150 




1193 


1212 


SEQ ID NO: 1153 


caaggagaaactgactgct 


6524 


6543 1 


5 


SEQ ID NO: 151 


ccagccccatcactttaca 


1231 


1250 


SEQ ID NO: 1154 


tgtagtctcctggtgctgg 


5094 


5113 1 


5 


SEQ ID NO: 152 


ctccactcacatcctccag 


1280 


1299 


SEQ ID NO: 1155 


ctggagcttagtaatggag 


8709 


8728 1 


5 


SEQ ID NO: 153 


catgccaacccccttctga 


1314 


1333 


SEQ ID NO: 1156 


tcagatgagggaacacatg 


8919 


8938 1 


5 


SEQ ID NO: 154 


gagagatcttcaacatggc 


1390 


1409 


SEQ ID NO: 1157 


gccaccctggaactctctc 


10869 


10888 1 


5 


SEQ ID NO: 155 


tcaacatggcgagggatca


1399 


1418 


SEQ ID NO: 1158 


tgatcccacctctcattga 


2965 


2984 1 


5 


SEQ ID NO: 156 


ccaccttgtatgcgctgag 


1429 


1448 


SEQ ID NO: 1159 


ctcagggatctgaaggtgg 


8187 


8206 1 


5 


SEQ ID NO: 157 


gtcaacaactatcataaga 


1455 


1474 


SEQ ID NO: 1160 


tcttgagttaaatgctgac 


4979 


4998 1 


5 


SEQ ID NO: 158 


tggacattgctaattacct 


1501 


1520 


SEQ ID NO: 1161


aggtatattcgaaagtcca 


12799 


12818 1 


5 


SEQ ID NO: 159 


ggacattgctaattacctg


1502 


1521 


SEQ ID NO: 1162 




12798 


12817 1 


5 


SEQ ID NO: 160 


ttctgcgggtcattggaaa 


1573 


1592 


SEQ ID NO: 1163 


tttcacatgccaaggagaa 


6514 


6533 1 


5 


SEQ ID NO: 161 


ccagaactcaagtcttcaa 


1620 


1639 


SEQ ID NO: 1164 


ttgaagtgtagtctcctgg 


5088 


5107 1 


5 


SEQ ID NO: 162 


agtcttcaatcctgaaatg 


1630 


1649 


SEQ ID NO: 1165 


catttctgattggtggact 


7757 


7776 1 


5 


SEQ ID NO: 163 


tgagcaagtgaagaacttt 


1868 


1887 


SEQ ID NO: 1166 


aaagtgccacttttactca 


6183 


6202 1 


5 


SEQ ID NO: 164 


agcaagtgaagaactttgt 


1870 


1889 


SEQ ID NO: 1167 


acaaagtcagtgccctgct 


6007 


6026 1 


5 


SEQ ID NO: 165 


tctgaaagaatctcaactt 


1964 


1983 


SEQ ID NO: 1168 


aagtccataatggttcaga 


12811 


12830 1 


5 


SEQ ID NO: 166 


actgtcatggacttcagaa 


1986 


2005 


SEQ ID NO: 1169 


ttctgaatatattgtcagt 


13376 


13395 1 


5 


SEQ ID NO: 167 


acttgacccagcctcagcc 


2051 


2070 


SEQ ID NO: 1170 


ggctcaccctgagagaagt 


12391 


12410 1 


5 


SEQ ID NO: 168 


tccaaataactaccttcct


2096 


2115 


SEQ ID NO: 1171 


aggaagatatgaagatgga 


4712 


4731 1 


5 


SEQ ID NO: 169 


actaccctcactgcctttg


2133 


2152 


SEQ ID NO: 1172 


caaatttgtggagggtagt 


10319 10338 1 


5 


SEQ ID NO: 170 


ttggatttgcttcagctga 


2149 


2168 


SEQ ID NO: 1173 


tcagtataagtacaaccaa 


9392 


9411 1 


5 


SEQ ID NO: 171 


ttggaagctctttttggga 


2211 


2230 


SEQ ID NO: 1174 


tcccgattcacgcttccaa 


11577 11596 1 


5 


SEQ ID NO: 172 


ggaagctctttttgggaag 


2213 


2232 


SEQ ID NO: 1175 


cttcagaaagctaccttcc 


7929 


7948 1 


5 


SEQ ID NO: 173 


tttttcccagacagtgtca 


2238 


2257 


SEQ ID NO: 1176 


tgaccttctctaagcaaaa 


4876 


4895 1 


5 


SEQ ID NO: 174 


agacagtgtcaacaaagct 


2246 


2265 


SEQ ID NO: 1177 


agcttggttttgccagtct 


2458 


2477 1 


5 


SEQ ID NO: 175 


ctttggctataccaaagat 


2321 


2340 


SEQ ID NO: 1178 


atctcgtgtctaggaaaag 


5968 


5987 1 


5 


SEQ ID NO: 176 


caaagatgataaacatgag 


2333 


2352 


SEQ ID NO: 1179 


ctcaaggataacgtgtttg 


12609 


12628 1 


5 


SEQ ID NO: 177 


gatatggtaaatggaataa 


2355 


2374 


SEQ ID NO: 1180 


ttatcttattaattatatc 


13079 


13098 1 


5 


SEQ ID NO: 178 


ggaataatgctcagtgttg 


2367 


2386 


SEQ ID NO: 1181 


caacacttacttgaattcc 


10661 


10680 1 


5 


SEQ ID NO: 179 


tttgaaatccaaagaagtc 


2402 


2421 


SEQ ID NO: 1182 


gacttcagagaaatacaaa 


11400 


11419 1 


5 


SEQ ID NO: 180 


gatcccccagatgattgga 


2534 


2553 


SEQ ID NO: 1183 


tccaatttccctgtggatc 


3681 


3700 1 


5 


SEQ ID NO: 181 


cagatgattggagaggtca 


2541 


2560 


SEQ ID NO: 1184 


tgaccacacaaacagtctg 


5363 


5382 1 


5 


SEQ ID NO: 182 


agaatgacttttttcttca 


2575 


2594 


SEQ ID NO: 1185 


tgaagtccggattcattct 


11015 11034 1 


5 


SEQ ID NO: 183 


gaactccccactggagctg 


2619 


2638 


SEQ ID NO: 1186 


cagctcaaccgtacagttc 


11861 


11880 1 


5 


SEQ ID NO: 184 


atatcttcatctggagtca 


2652 


2671 


SEQ ID NO: 1187 


tgacttcagtgcagaatat 


11966 


11985 1 


5 


SEQ ID NO: 185 


gtcattgctcccggagcca 


2667 


2686 


SEQ ID NO: 1188 


tggccccgtttaccatgac 


5809 


5828 1 


5 
SEQ ID NO: 186


gctgaagtttatcattcct


2873 


2892 


SEQ ID NO: 1189 


aggaggctttaagttcagc 


7600 7619 1 


5 


SEQ ID NO: 187








SEQ ID NO: 1190 


gtctcttcctccatggaat 


10470 10489 1 


5 


SEQ ID NO: 188 


cca gagaacaggcag 


2976 


2995 


SEQ ID NO: 1191 


actgactgcacgctttgag 


11756 11775 1 


5 


SEQ ID NO: 189








SEQ ID NO: 1192 


ctgagagaagtgtcttcaa 


12399 12418 1 


5 


SEQ ID NO: 190


acc gtccag gaag cc 


3285 


3304 


SEQ ID NO: 1193 




12784 12803 1 


5 


SEQ ID NO: 191




3292 


3311 


SEQ ID NO: 1194 


ggaaggcagagtttactgg 


9148 9167 1 


5 


SEQ ID NO: 192 


acattcagaacaagaaaat


3394 


3413 


SEQ ID NO: 1195 


atttcctaaagctggatgt 


11167 11186 1 


5 


SEQ ID NO: 193 


gaaaaatcaagggtgttat 


3463 


3482 


SEQ ID NO: 1196 


ataaactgcaagatttttc


13600 13619 1 


5 


SEQ ID NO: 194 


aaatcaagggtgttatttc 


3466 


3485 


SEQ ID NO: 1197 


gaaacaatgcattagattt 


9745 9764 1 


5 


SEQ ID NO: 195 


ggca agagaagaga 




3628 


SEQ ID NO: 1198 


tctcccgtgtataatgcca 


11781 11800 1 


5 


SEQ ID NO: 196 




3622 


3641 


SEQ ID NO: 1199 


tcaaaacctactgtctctt 


10458 10477 1 


5 


SEQ ID NO: 197 


I^gTcttccaaWccc 3 


3673 


3692 


SEQ ID NO: 1200 


gggaactacaatttcattt 


7013 7032 1 


5 


SEQ ID NO: 198 


atgacttccaatttccctg 


3675 


3694 


SEQ ID NO: 1201 


caggctgattacgagtcat 


4917 4936 1 


5 


SEQ ID NO: 199 


acttccaatttccctgtgg 


3678 


3697 


SEQ ID NO: 1202 


ccacgaaaaatatggaagt 


10360 10379 1 


5 


SEQ ID NO: 200


ag tgcaa gage ca gg 




3822 


SEQ ID NO: 1203 


ccatcagttcagataaact 


7989 8008 1 


5 


SEQ ID NO: 201 


gcaagaccacc caa 


3860 


3879 


SEQ ID NO: 1204 


attgacctgtccattcaaa


13671 13690 1 


5 


SEQ ID NO: 202 


gaaggag caacc ccag 


3884 


3903 


SEQ ID NO: 1205 


ctggaattgtcattccttc


11728 11747 1 


5 


SEQ ID NO: 203








SEQ ID NO: 1206 


ttttaacaaaagtggaagt 


6821 6840 1 


5 


SEQ ID NO: 204


TttT^aamr 3 

c c c aaaaagega g 


3939 


3958 


SEQ ID NO: 1207 


catcactgccaaaggagag 


8486 8505 1 


5 






3948 


3967 


SEQ ID NO: 1208 


tgactcactcattgatttt 


12680 12699 1 


5 


SEQ ID NO: 206 


ttccttt cctttt 03 
cc gec gg 


4003 


4022 


SEQ ID NO: 1209 


ccacaaacaatgaagggaa 


9256 9275 1 


5 


SEQ ID NO: 207


caagtctgtgggattcca 






SEQ ID NO: 1210 




9566 9585 1 


5 


SEQ ID NO: 208


aag ccc ac acca 


4117 


4136 


SEQ ID NO: 1211 


atgggaagtataagaactt 


4834 4853 1 


5 


SEQ ID NO: 209


9 ° C CCC "" ° 


4159 




SEQ ID NO: 1212 


agaaaaacaaacacaggca


9643 9662 1


5 


SEQ ID NO: 210




4242 


4261 


SEQ ID NO: 1213 


tgaagtgtagtctcctggt 


5089 5108 1 


5 


SEQ ID NO: 211


ccagcacagScclttcag 


4243 


4262 


SEQ ID NO: 1214 


ctgaaatacaatgctctgg 


5511 5530 1 


5 


SEQ ID NO: 212 


actatcatgtgatgggtct 


4367 


4386 


SEQ ID NO: 1215 


agacacctgattttatagt 


7948 7967 1 


5 


SEQ ID NO: 213 


accacagatgtctgcttca 


4496 


4515 


SEQ ID NO: 1216 


tgaaggctgactctgtggt 


4282 4301 1 


5 


SEQ ID NO: 214 


ccacagatgtctgcttcag 


4497 


4516 


SEQ ID NO: 1217 


ctgagcaacaaatttgtgg 


10311 10330 1 


5 


SEQ ID NO: 215 


tttggactccaaaaagaaa 


4520 


4539 


SEQ ID NO: 1218 


tttctctcatgattacaaa 


5933 5952 1 


5 


SEQ ID NO: 216 


tcaaagaagtcaagattga


4552 


4571 


SEQ ID NO: 1219 


tcaaggataacgtgtttga 


12610 12629 1 


5 


SEQ ID NO: 217 


atgagaactacgagctgac 


4798 


4817 


SEQ ID NO: 1220 


gtcagatattgttgctcat 


10187 10206 1 


5 


SEQ ID NO: 218 


ttaaaatctgacaccaatg 


4818 


4837 


SEQ ID NO: 1221 


cattcattgaagatgttaa 


7342 7361 1 


5 


SEQ ID NO: 219 


gaagtataagaactttgcc


4838 


4857 


SEQ ID NO: 1222 


ggcaaatttgaaggacttc 


11994 12013 1 


5 


SEQ ID NO: 220 


aagtataagaactttgcca


4839 


4858 


SEQ ID NO: 1223 


tggcaaatttgaaggactt 


11993 12012 1 


5 


SEQ ID NO: 221 


ttcttcagcctgctttctg 


4941 


4960 


SEQ ID NO: 1224 


cagaatccagatacaagaa 


6884 6903 1 


5 


SEQ ID NO: 222 


ctggatcactaaattccca 


4957 


4976 


SEQ ID NO: 1225 


tgggtctttccagagccag 


11033 11052 1 


5 


SEQ ID NO: 223 


aaattaatagtggtgctca 


5014 


5033 


SEQ ID NO: 1226 




6248 6267 1 


5 
SEQ ID NO: 224


agtgcaacgaccaacttga 


5073 


5092 


SEQ ID NO: 225


ctgggaagtgcttatcagg 


5238 


5257 


SEQ ID NO: 226




5278 


5297 


SEQ ID NO: 227


aaaaacattttcaacttca


5280 


5299 


SEQ ID NO: 228 


tcagtcaagaaggacttaa


5302 


5321 


SEQ ID NO: 229 


tcaaatgacatgatgggct


5325 


5344 


SEQ ID NO: 230




5367 


5386 


SEQ ID NO: 231


tottaaTaTcttgaSaTa 03 


5409 


5428 


SEQ ID NO: 232 


caagttttataagcaaact 


5441 


5460 


SEQ ID NO: 233 


tggtaactactttaaacag 


5488 


5507 


SEQ ID NO: 234 


aacagtgacctgaaataca


5502 


5521 


SEQ ID NO: 235


gggaaactacggctagaac


5544 


5563 






5620 


5639 


SEQ ID NO: 237


tea^ 


5652 


5671 


SEQ ID NO: 238 


gcagacactgttgctaagg 


5667 


5686 




tctggggagaacatactgg 


5866 


5885 


SEQ ID NO: 240




5934 


5953 


SEQ ID NO: 241


ctgagcagacaggcacctg


6034 


6053 


SEQ ID NO: 242


caatttaacaacaatgaat 


6066 


6085 




tggacgaactctggctgac 


6140 


6159 


SEQ ID NO: 244 


cttttactcagtgagccca




6211 


SEQ ID NO: 245 


tcattgatgctttagagat


6217 


6236 


SEQ ID NO: 246


aaaaccaagatgttcactc 


6295 


6314 


SEQ ID NO: 247




6357 


6376 


SEQ ID NO: 248


tagttgtactggaaaacgt 


6376 


6395 


SEQ ID NO: 249


ggaaaacgtacagagaaag


6386


6405 


SEQ ID NO: 250


gaaaacgtacagagaaagc


6387


6406 


SEQ ID NO: 251 


aaagctgaagcacatcaat 


6401 


6420 


SEQ ID NO: 252 


aagctgaagcacatcaata 


6402 


6421 


SEQ ID NO: 253 


tgaagcacatcaatattga 


6406 


6425 


SEQ ID NO: 254 


atcaatattgatcaatttg 


6414 


6433 


SEQ ID NO: 255 


taatgattatctgaattca 


6476 


6495 


SEQ ID NO: 256 


gattatctgaattcattca 


6480 


6499 


SEQ ID NO: 257 


aattgggagagacaagttt 


6498 


6517 


SEQ ID NO: 258 


aaaatagctattgctaata


6693 


6712 


SEQ ID NO: 259 


aaaattaaaaagtcttgat 


6731 


6750 


SEQ ID NO: 260 


ttgaaaatattgattttaa 


6808 


6827 


SEQ ID NO: 261 


agacatccagcacctagct 


6938 


6957 



SEQ ID NO: 1227 


tcaaattcctggatacact 


9848 9867 1 


5 


SEQ ID NO: 1228 


cctgaccttcacataccag 


8310 8329 1 


5 


SEQ ID NO: 1229 


aagtaaaagaaaattttgc 


10744 10763 1 


5 


SEQ ID NO: 1230 


tgaagtaaaagaaaatttt 


10742 10761 1 


5 


SEQ ID NO: 1231 


ttaaggacttccattctga 


13363 13382 1 


5 


SEQ ID NO: 1232 


agcccatcaatatcattga


6205 6224 1 


5 


SEQ ID NO: 1233 


tgtttcaactgcctttgtg 


11219 11238 1 


5 


SEQ ID NO: 1234 


tgttttcctatttccaaga 


12835 12854 1 


5


SEQ ID NO: 1235 


agttattttgctaaacttg 


14043 14062 1 


5 


SEQ ID NO: 1236 


ctgtttttagaggaaacca 


7512 7531 1 


5 


SEQ ID NO: 1237 


tgtatagcaaattcctgtt 


5890 5909 1 


5 


SEQ ID NO: 1238 


gttccttccatgatttccc 


10933 10952 1 


5 


SEQ ID NO: 1239 




11204 11223 1 


5 


SEQ ID NO: 1240 


ctgctaagaaccttactga 


7780 7799 1 


5 


SEQ ID NO: 1241 


cctttcaagcactgactgc 


11746 11765 1 


5 


SEQ ID NO: 1242 


ccaggttttccacaccaga 


8038 8057 1 


5 


SEQ ID NO: 1243 


ctttttcaccaacggagaa 


10838 10857 1 


5 


SEQ ID NO: 1244 


caggaggctttaagttcag 


7599 7618 1 


5 


SEQ ID NO: 1245 


attccttcctttacaattg 


8082 8101 1 


5 


SEQ ID NO: 1246 


gtcagcccagttccttcca 


10924 10943 1 


5 


SEQ ID NO: 1247 


tgggctaaacgtatgaaag 


7827 7846 1 


5 


SEQ ID NO: 1248 


atcttcataagttcaatga 


13174 13193 1 


5 


SEQ ID NO: 1249 


gagtgaaatgctgtttttt 


8630 8649 1 


5 


SEQ ID NO: 1250 


taatgattttcaagttcct 


8294 8313 1 


5 


SEQ ID NO: 1251 


acgttagcctctaagacta 


11928 11947 1 


5 


SEQ ID NO: 1252 


cttttacaattcattttcc 


13014 13033 1 


5 


SEQ ID NO: 1253 


gctttctcttccacatttc 


10052 10071 1 


5 


SEQ ID NO: 1254 


attgatgttagagtgcttt 


6984 7003 1 


5 


SEQ ID NO: 1255 


tattgatgttagagtgctt 


6983 7002 1 


5 


SEQ ID NO: 1256 


tcaaccttaatgattttca 


8287 8306 1 


5 


SEQ ID NO: 1257 


caaagccatcactgatgat


1660 1679 1 


5 


SEQ ID NO: 1258 


tgaaatcattgaaaaatta 


6719 6738 1 


5 


SEQ ID NO: 1259 


tgaagtagctgagaaaatc 


7094 7113 1 


5 


SEQ ID NO: 1260 


aaacattcctttaacaatt 


9488 9507 1 


5 


SEQ ID NO: 1261 


tattgaaaatattgatttt 


6806 6825 1 


5 


SEQ ID NO: 1262 


atcatatccgtgtaatttt 


6757 6776 1 


5 


SEQ ID NO: 1263 


ttaatcttcataagttcaa 


13171 13190 1 


5 


SEQ ID NO: 1264 


agcttggttttgccagtct 


2458 2477 1 


5 
SEQ ID NO: 262


caatttcatttgaaagaat 


7021 


7040 


SEQ ID NO: 1265 


attccttcctttacaattg 


8082 8101 1 


5 


SEQ ID NO: 263 


aggttttaatggataaatt 


7174 


7193 


SEQ ID NO: 1266 


aattgttgaaagaaaacct 


13147 13166 1 


5 


SEQ ID NO: 264 


cagaagctaagcaatgtcc 


7233 


7252 


SEQ ID NO: 1267 


ggacaaggcccagaatctg 


12545 12564 1 


5 


SEQ ID NO: 265 


taagataaaagattacttt 


7262 


7281 


SEQ ID NO: 1268 


aaagaaaacctatgcctta 


13155 13174 1 


5 


SEQ ID NO: 266 


aaagattactttgagaaat 


7269 


7288 


SEQ ID NO: 1269 


atttcttaaacattccttt


9481 9500 1 


5 


SEQ ID NO: 267 


gagaaattagttggattta 


7281 


7300 


SEQ ID NO: 1270 


taaagccattcagtctctc 


12962 12981 1 


5 


SEQ ID NO: 268 


atttattgatgatgctgtc 


7295 


7314 


SEQ ID NO: 1271 


gacatgttgataaagaaat 


7371 7390 1 


5 


SEQ ID NO: 269 


gaattatcttttaaaacat 


7326 


7345 


SEQ ID NO: 1272 


atgtatcaaatggacattc 


7677 7696 1 


5 


SEQ ID NO: 270 


ttaccaccagtttgtagat 


7403 


7422 


SEQ ID NO: 1273 


atctggaaccttgaagtaa 


10731 10750 1 


5 


SEQ ID NO: 271 


ttgcagtgtatctggaaag


7540 


7559 


SEQ ID NO: 1274 


cttttcacattagatgcaa 


8412 8431 1 


5 


SEQ ID NO: 272 


cattcagcaggaacttcaa 


7691 


7710 


SEQ ID NO: 1275 


ttgaaggacttcaggaatg 


12001 12020 1 


5 


SEQ ID NO: 273 


acacctgattttatagtcc 


7950 


7969 


SEQ ID NO: 1276 


ggactcaaggataacgtgt 


12606 12625 1 


5 


SEQ ID NO: 274 


ggattccatcagttcagat 


7984 


8003 


SEQ ID NO: 1277 


atcttcaatgattatatcc 


13116 13135 1 


5 


SEQ ID NO: 275 


ttgtagaaatgaaagtaaa 


8104 


8123 


SEQ ID NO: 1278 


tttatgattatgtcaacaa 


12352 12371 1 


5 


SEQ ID NO: 276 


ctgaacagtgagctgcagt 


8148 


8167 


SEQ ID NO: 1279 


actggacttctctagtcag


8801 8820 1 


5 


SEQ ID NO: 277 


aatccaatctcctcttttc 


8399 


8418 


SEQ ID NO: 1280 


gaaaaatgaagtccggatt 


11009 11028 1 


5 


SEQ ID NO: 278 


attttgattttcaagcaaa 


8524 


8543 


SEQ ID NO: 1281 


tttgcaagttaaagaaaat 


14015 14034 1 


5 


SEQ ID NO: 279 


ttttgattttcaagcaaat 


8525 


8544 


SEQ ID NO: 1282 


atttgatttaagtgtaaaa 


9614 9633 1 


5 


SEQ ID NO: 280 


tgattttcaagcaaatgca 


8528 


8547 


SEQ ID NO: 1283 


tgcaagttaaagaaaatca 


14017 14036 1 


5 


SEQ ID NO: 281 


atgctgttttttggaaatg 


8637 


8656 


SEQ ID NO: 1284 


cattggtaggagacagcat 


11195 11214 1 


5 


SEQ ID NO: 282 


tgctgttttttggaaatgc 


8638 


8657 


SEQ ID NO: 1285 


gcattggtaggagacagca 


11194 11213 1 


5 


SEQ ID NO: 283 


aaaaaaatacactggagct 


8698 


8717 


SEQ ID NO: 1286 


agctagagggcctcttttt 


10825 10844 1 


5 


SEQ ID NO: 284 


actggagcttagtaatgga 


8708 


8727 


SEQ ID NO: 1287 


tccactcacatcctccagt 


1281 1300 1 


5 


SEQ ID NO: 285 


cttctggaaaagggtcatg 


8878 


8897 


SEQ ID NO: 1288 


catgaacccctacatgaag 


13751 13770 1 


5 


SEQ ID NO: 286 


ggaaaagggtcatggaaat 


8883 


8902 


SEQ ID NO: 1289 


atttgaaagttcgttttcc 


9274 9293 1 


5 


SEQ ID NO: 287 


gggcctgccccagattctc 


8902 


8921 


SEQ ID NO: 1290 


gagaacattatggaggccc 


9432 9451 1 


5 


SEQ ID NO: 288 


ttctcagatgagggaacac 


8916 


8935 


SEQ ID NO: 1291 


gtgtcttcaaagctgagaa 


12408 12427 1 


5 


SEQ ID NO: 289 


gatgagggaacacatgaat 


8922 


8941 


SEQ ID NO: 1292 


attccagcttccccacatc 


8330 8349 1 


5 


SEQ ID NO: 290 


ctttggactgtccaataag 


8978 


8997 


SEQ ID NO: 1293 


cttatgggatttcctaaag 


11159 11178 1 


5 


SEQ ID NO: 291 


gcatccacaaacaatgaag 


9252 


9271 


SEQ ID NO: 1294 


cttcatctgtcattgatgc 


10219 10238 1


5 


SEQ ID NO: 292 


cacaaacaatgaagggaat


9257 


9276 


SEQ ID NO: 1295 


attccctgaagttgatgtg 


11480 11499 1 


5 


SEQ ID NO: 293 


ccaaaatttctctgctgga 


9407 


9426 


SEQ ID NO: 1296


tccatcacaaatcctttgg


9663 9682 1 


5 


SEQ ID NO: 294 


caaaatttctctgctggaa 


9408 


9427 


SEQ ID NO: 1297 


ttccatcacaaatcctttg 


9662 9681 1 


5 


SEQ ID NO: 295 


tctgctggaaacaacgaga 


9417 


9436 


SEQ ID NO: 1298 


tctcaagagttacagcaga 


13221 13240 1 


5 


SEQ ID NO: 296 


ctgctggaaacaacgagaa


9418 


9437 


SEQ ID NO: 1299 


ttctcaagagttacagcag 


13220 13239 1 


5 


SEQ ID NO: 297 


agaacattatggaggccca 


9433 


9452 


SEQ ID NO: 1300 


tgggcctgccccagattct 


8901 8920 1 


5 


SEQ ID NO: 298 


agaagcaaatctggatttc 


9467 


9486 


SEQ ID NO: 1301 


gaaatcttcaatttattct 


13813 13832 1 


5 


SEQ ID NO: 299 


tttctctctatgggaaaaa


9557 


9576 


SEQ ID NO: 1302 


tttttgcaagttaaagaaa 


14013 14032 1 


5 
SEQ ID NO: 300


tcagagcatcaaatccttt 


9704 


9723 


SEQ ID NO: 1303 


aaagaaaatcaggatctga 


14025 14044 1 


5 


SEQ ID NO: 301 


cagaaacaatgcattagat 


9743 


9762 


SEQ ID NO: 1304 


atctatgccatctcttctg 


5625 5644 1 


5 


SEQ ID NO: 302 


tacacattaatcctgccat 


9993 


10012 


SEQ ID NO: 1305 


atggagtctttattgtgta 


14081 14100 1 


5 


SEQ ID NO: 303 


agtcagatattgttgctca 


10186 


10205 


SEQ ID NO: 1306 


tgagaactacgagctgact 


4799 4818 1 


5 


SEQ ID NO: 304 


ggagggtagtcataacagt 


10328 


10347 


SEQ ID NO: 1307 




2726 2745 1 


5 


SEQ ID NO: 305 


caaaagccgaaattccaat 


10396 10415 


SEQ ID NO: 1308 


attgaagtacctacttttg


8358 8377 1 


5 


SEQ ID NO: 306 


aaaagccgaaattccaatt 


10397 


10416 


SEQ ID NO: 1309 


aattgaagtacctactttt


8357 8376 1 


5 


SEQ ID NO: 307 


ttcaagcaagaacttaatg 


10428 


10447 


SEQ ID NO: 1310 


cattatggcccttcgtgaa 


13250 13269 1 


5 


SEQ ID NO: 308 


cctcttacttttccattga 


10570 


10589 


SEQ ID NO: 1311 


tcaaaagaagcccaagagg 


12939 12958 1 


5 


SEQ ID NO: 309 


tgaggccaacacttacttg 






SEQ ID NO: 1312 


caagcatctgattgactca 


12668 12687 1 


5 


SEQ ID NO: 310 


cacttacttgaattccaag 


10664 


10683 


SEQ ID NO: 1313 


cttgaacacaaagtcagtg


6000 6019 1 


5 


SEQ ID NO: 311 


gaagtaaaagaaaattttg 


10743 


10762 


SEQ ID NO: 1314 


caaaaacattttcaacttc 


5279 5298 1 


5 


SEQ ID NO: 312 


cctggaactctctccatgg 


10874 


10893 


SEQ ID NO: 1315 


ccatttacagatcttcagg 


11364 11383 1 


5 


SEQ ID NO: 313 


agctggatgtaaccaccag 


11176 


11195 


SEQ ID NO: 1316 


ctggattccacatgcagct 


11847 11866 1 


5 


SEQ ID NO: 314 


aaaattccctgaagttgat 


11477 


11496 


SEQ ID NO: 1317 


atcatatccgtgtaatttt 


6757 6776 1 


5 


SEQ ID NO: 315 


cagatggcattgctgcttt 


11605 


11624 


SEQ ID NO: 1318 


aaagctgagaagaaatctg 


12416 12435 1 


5 


SEQ ID NO: 316 




11606 


11625 


SEQ ID NO: 1319 


caaagctgagaagaaatct 


12415 12434 1 


5 


SEQ ID NO: 317 


tgttgaaacagtcctggat


11834 


11853 


SEQ ID NO: 1320 




13095 13114 1 


5 


SEQ ID NO: 318 


catattcaaaactgagttg 


12221 


12240 


SEQ ID NO: 1321 


caactctctgattactatg 


13623 13642 1 


5 


SEQ ID NO: 319 


aaagatttatcaaaagaag 


12930 


12949 


SEQ ID NO: 1322 


cttcaatttattcttcttt 


13818 13837 1 


5 


SEQ ID NO: 320 


attttccaactaatagaag


13026 


13045 


SEQ ID NO: 1323 


cttcaaagacttaaaaaat 


8006 8025 1 


5 


SEQ ID NO: 321 


aattatatccaagatgaga 


13089 


13108 


SEQ ID NO: 1324 


tctcttcctccatggaatt 


10471 10490 1 


5 


SEQ ID NO: 322 


ttcaggaagcttctcaaga


13210 


13229 


SEQ ID NO: 1325 


tcttcataagttcaatgaa 


13175 13194 1 


5 


SEQ ID NO: 323 


ttgagcaatttctgcacag 


13429 


13448 


SEQ ID NO: 1326 


ctgttgaaagatttatcaa 


12924 12943 1 


5 


SEQ ID NO: 324 


ctgatatacatcacggagt 


13704 


13723 


SEQ ID NO: 1327 


actcaatggtgaaattcag 


7457 7476 1 


5 


SEQ ID NO: 325 


acatcacggagttactgaa 


13711 


13730 


SEQ ID NO: 1328 


ttcagaagctaagcaatgt 


7231 7250 1 


5 


SEQ ID NO: 326 


actgcctatattgataaaa 


13874 


13893 


SEQ ID NO: 1329 


ttttggcaagctatacagt 


8372 8391 1 


5 


SEQ ID NO: 327 


aggatggcattttttgcaa 


14003 


14022 


SEQ ID NO: 1330 


ttgcaagcaagtctttcct 


3005 3024 1 


5 


SEQ ID NO: 328 


ttttttgcaagttaaagaa 


14012 


14031 


SEQ ID NO: 1331 


ttctctctatgggaaaaaa 


9558 9577 1 


5 


SEQ ID NO: 329 


tccagaactcaagtcttca 


1619 


1638 


SEQ ID NO: 1332 


tgaaatgctgttttttgga 


8633 8652 3 


4 


SEQ ID NO: 330 


agttagtgaaagaagttct 


1948 


1967 


SEQ ID NO: 1333 


agaatctgtaccaggaact 


12556 12575 3 


4 


SEQ ID NO: 331 


atttacagctctgacaagt 


5427 


5446 


SEQ ID NO: 1334 


acttcagagaaatacaaat 


11401 11420 3 


4 


SEQ ID NO: 332 


gattatctgaattcattca 


6480 


6499 


SEQ ID NO: 1335 


tgaaaccaatgacaaaatc 


7421 7440 3 


4 


SEQ ID NO: 333 


gtgcccttctcggttgctg 


18 


37 


SEQ ID NO: 1336 


cagctgagcagacaggcac 


6031 6050 2 


4 


SEQ ID NO: 334 


attcaagcacctccggaag 


245 


264 


SEQ ID NO: 1337 


cttcataagttcaatgaat 


13176 13195 2 


4 


SEQ ID NO: 335 


gactgctgattcaagaagt 


308 


327 


SEQ ID NO: 1338 


acttcccaactctcaagtc 


13407 13426 2 


4 


SEQ ID NO: 336 


ttgctgcagccatgtccag 


475 


494 


SEQ ID NO: 1339 


ctgggcagctgtatagcaa 


5881 5900 2 


4 


SEQ ID NO: 337 


agaaagatgaacctactta 


547 


566 


SEQ ID NO: 1340 


taagtatgatttcaattct 


10490 10509 2 


4 
SEQ ID NO: 338


tgaagactctccaggaact 


1087 


1106 


SEQ ID NO: 1341 


agttcaatgaatttattca 


13183 13202 2 


4 


SEQ ID NO: 339 


atctctcttgccacagctg 


1202 


1221 


SEQ ID NO: 1342 


cagcccagccatttgagat 


9229 9248 2 


4 


SEQ ID NO: 340 


tctctcttgccacagctga 


1203 


1222 


SEQ ID NO: 1343 


tcagcccagccatttgaga 


9228 9247 2 


4 


SEQ ID NO: 341 


tgaggtgtccagccccatc 


1223 


1242 


SEQ ID NO: 1344 


gatgggaaagccgccctca 


5208 5227 2 


4 


SEQ ID NO: 342 


ccagaactcaagtcttcaa 


1620 


1639 


SEQ ID NO: 1345 


ttgaaagcagaacctctgg 


5907 5926 2 


4 


SEQ ID NO: 343 


ctgaaaaagttagtgaaag 


1941 


1960 


SEQ ID NO: 1346 


ctttctcgggaatattcag 


10623 10642 2 


4 


SEQ ID NO: 344 


tttttcccagacagtgtca 


2238 


2257 


SEQ ID NO: 1347 


tgacaggcattttgaaaaa 


9722 9741 2 


4 


SEQ ID NO: 345 


ttttcccagacagtgtcaa 


2239 


2258 


SEQ ID NO: 1348 


ttgacaggcattttgaaaa 


9721 9740 2 


4 


SEQ ID NO: 346 


cattcagaacaagaaaatt 


3395 


3414 


SEQ ID NO: 1349 


aattccaattttgagaatg 


10406 10425 2 


4 


SEQ ID NO: 347 


tgaagagaagattgaattt 


3620 


3639 


SEQ ID NO: 1350 


aaatgtcagctcttgttca 


10894 10913 2 


4 


SEQ ID NO: 348 


tttgaatggaacacaggca 


3636 


3655 


SEQ ID NO: 1351 


tgccagtttgaaaaacaaa 


11807 11826 2 


4 


SEQ ID NO: 349 


ttctagattcgaatatcaa 


4399 


4418 


SEQ ID NO: 1352 


ttgacatgttgataaagaa 


7369 7388 2 


4 


SEQ ID NO: 350 


gattcgaatatcaaattca 


4404 


4423 


SEQ ID NO: 1353 


tgaagtagaccaacaaatc 


7154 7173 2 


4 


SEQ ID NO: 351 


tgcaacgaccaacttgaag 


5075 


5094 


SEQ ID NO: 1354 


cttcaggttccatagtgca 


11376 11395 2 


4 


SEQ ID NO: 352 


ttaagctctcaaatgacat 


5317 


5336 


SEQ ID NO: 1355 


atgttgataaagaaattaa 


7374 7393 2 


4 


SEQ ID NO: 353 


caatttaacaacaatgaat 


6066 


6085 


SEQ ID NO: 1356 


attcaaactgcctatattg 


13868 13887 2 


4 


SEQ ID NO: 354 


tgaatacagccaggacttg 


6080 


6099 


SEQ ID NO: 1357 


caagagcacacggtcttca 


10679 10698 2 


4 


SEQ ID NO: 355 


catcaatattgatcaattt 


6413 


6432 


SEQ ID NO: 1358 


aaattccctgaagttgatg 


11478 11497 2 


4 


SEQ ID NO: 356 


ttgagcatgtcaaacactt 


7051 


7070 


SEQ ID NO: 1359 


aagtaagtgctaggttcaa 


9373 9392 2 


4 


SEQ ID NO: 357 


tgaaggagactattcagaa 


7219 


7238 


SEQ ID NO: 1360 


ttctgcacagaaatattca 


13438 13457 2 


4 


SEQ ID NO: 358 


ttcaggctcttcagaaagc 


7921 


7940 


SEQ ID NO: 1361 


gcttgctaacctctctgaa 


12304 12323 2 


4 


SEQ ID NO: 359 


tccacaaattgaacatccc 


8779 


8798 


SEQ ID NO: 1362 


gggacctaccaagagtgga 


12525 12544 2 


4 


SEQ ID NO: 360 


tgaataccaatgctgaact 


10159 


10178 


SEQ ID NO: 1363 


agttcaatgaatttattca 


13183 13202 2 


4 


SEQ ID NO: 361 


taaactaatagatgtaatc 


12890 12909 


SEQ ID NO: 1364 


gattactatgaaaaattta 


13632 13651 2 


4 


SEQ ID NO: 362 


ttgacctgtccattcaaaa 


13672 


13691 


SEQ ID NO: 1365 


ttttaaaagaaatcttcaa


13805 13824 2 


4 


SEQ ID NO: 363 


gggctgagtgcccttctcg 


11 


30 


SEQ ID NO: 1366 


cgaggccaggccgcagccc 


76 95 1 


4 


SEQ ID NO: 364 


ggctgagtgcccttctcgg 


12 


31 


SEQ ID NO: 1367 


ccgaggccaggccgcagcc 


75 94 1 


4 


SEQ ID NO: 365 


ctgagtgcccttctcggtt 


14 


33 


SEQ ID NO: 1368 


aaccgtgcctgaatctcag 


11549 11568 1 


4 


SEQ ID NO: 366 


tctcggttgctgccgctga 


25 


44 


SEQ ID NO: 1369 


tcagctgacctcatcgaga 


2160 2179 1 


4 


SEQ ID NO: 367 


caggccgcagcccaggagc


82


101 


SEQ ID NO: 1370 


gctctgcagcttcatcctg 


368 387 1 


4 


SEQ ID NO: 368 


gctggcgctgcctgcgctg 


143 


162 


SEQ ID NO: 1371 


cagcacagaccatttcagc 


4244 4263 1 


4 


SEQ ID NO: 369 


tgctgctggcgggcgccag 


169 


188 


SEQ ID NO: 1372 


ctggatgtaaccaccagca 


11178 11197 1 


4 


SEQ ID NO: 370 


ctggtctgtccaaaagatg 


219 


238 


SEQ ID NO: 1373 


catcctgaagaccagccag 


380 399 1 


4 


SEQ ID NO: 371 


ctgagagttccagtggagt 


283 


302 


SEQ ID NO: 1374 


actcaccctggacattcag 


3383 3402 1 


4 


SEQ ID NO: 372 


tccagtggagtccctggga 


291 


310 


SEQ ID NO: 1375 


tcccggagccaaggctgga 


2675 2694 1 


4 


SEQ ID NO: 373 


aggttgagctggaggttcc 


346 


365 


SEQ ID NO: 1376 


ggaaccctctccctcacct 


4728 4747 1 


4 


SEQ ID NO: 374 


tgagctggaggttccccag 


350 


369 


SEQ ID NO: 1377 


ctgggaggcatgatgctca 


9163 9182 1 


4 


SEQ ID NO: 375 


tctgcagcttcatcctgaa 


370 


389 


SEQ ID NO: 1378 


ttcaaatataatcggcaga 


3261 3280 1 


4 
SEQ ID NO: 376


gccagtgcaccctgaaaga 


394 


413 


SEQ ID NO: 1379 


tcttccgttctgtaatggc 


5794 5813 1 


4 


SEQ ID NO: 377 


ctctgaggagtttgctgca 


464 


483 


SEQ ID NO: 1380 


tgcaagaatattttgagag 


6340 6359 1 


4 


SEQ ID NO: 378 


aggtatgagctcaagctgg 


492 


511 


SEQ ID NO: 1381 


ccagtttccggggaaacct 


12716 12735 1 


4 


SEQ ID NO: 379 


tcctttacccggagaaaga 


535 


554 


SEQ ID NO: 1382 


tctttttgggaagcaagga 


2219 2238 1 


4 


SEQ ID NO: 380 


catcaagaggggcatcatt 


575 


594 


SEQ ID NO: 1383 


aatggtcaagttcctgatg 


2277 2296 1 


4 


SEQ ID NO: 381 


tcctggttcccccagagac 


601 


620 


SEQ ID NO: 1384 


gtctctgaactcagaagga 


13988 14007 1 


4 


SEQ ID NO: 382 


aagaagccaagcaagtgtt 


622 


641 


SEQ ID NO: 1385 


aacaaataaatggagtctt 


14072 14091 1 


4 


SEQ ID NO: 383 


aagcaagtgttgtttctgg 


630 


649 


SEQ ID NO: 1386 


ccagagccaggtcgagctt 


11042 11061 1 


4 


SEQ ID NO: 384 


tctggataccgtgtatgga 


644 


663 


SEQ ID NO: 1387 


tccatgtcccatttacaga 


11356 11375 1 


4 


SEQ ID NO: 385 


ccactcactttaccgtcaa 


670 


689 


SEQ ID NO: 1388 


ttgattttaacaaaagtgg 


6817 6836 1 


4 


SEQ ID NO: 386 


aggaagggcaatgtggcaa 


693 


712 


SEQ ID NO: 1389 


ttgcaagcaagtctttcct 


3005 3024 1 


4 


SEQ ID NO: 387 


gcaatgtggcaacagaaat 


700 


719 


SEQ ID NO: 1390 


atttccataccccgtttgc 


3480 3499 1 


4 


SEQ ID NO: 388 


caatgtggcaacagaaata 


701 


720 


SEQ ID NO: 1391 


tattcttcttttccaattg 


13826 13845 1 


4 


SEQ ID NO: 389 


tggcaacagaaatatccac 


706 


725 


SEQ ID NO: 1392 


gtggcttcccatattgcca 


1887 1906 1 


4 


SEQ ID NO: 390 


agagacctgggccagtgtg 


729 


748 


SEQ ID NO: 1393 


cacattacatttggtctct 


2930 2949 1 


4 


SEQ ID NO: 391 


tgtgatcgcttcaagccca 


744 


763 


SEQ ID NO: 1394 


tgggaaagccgccctcaca 


5210 5229 1 


4 


SEQ ID NO: 392 


gtgatcgcttcaagcccat 


745 


764 


SEQ ID NO: 1395 


atgggaaagccgccctcac 


5209 5228 1 


4 


SEQ ID NO: 393 


cagcccacttgctctcatc 


776 


795 


SEQ ID NO: 1396 


gatgctgaacagtgagctg 


8144 8163 1 


4 


SEQ ID NO: 394 


gctctcatcaaaggcatga 


786 


805 


SEQ ID NO: 1397 


tcataacagtactgtgagc 


10337 10356 1 


4 


SEQ ID NO: 395 


ccttgtcaactctgatcag


811 


830 


SEQ ID NO: 1398 


ctgagtgggtttatcaagg 


12445 12464 1 


4 


SEQ ID NO: 396 


cttgtcaactctgatcagc 


812 


831 


SEQ ID NO: 1399 


gctgagtgggtttatcaag 


12444 12463 1 


4 


SEQ ID NO: 397 


agccatctgcaaggagcaa 


884 


903 


SEQ ID NO: 1400 


ttgcaatgagctcatggct 


3805 3824 1 


4 


SEQ ID NO: 398 


gccatctgcaaggagcaac 


885 


904 


SEQ ID NO: 1401 


gttgcaatgagctcatggc 


3804 3823 1 


4 


SEQ ID NO: 399 


cttcctgcctttctcctac 


908 


927 


SEQ ID NO: 1402 


gtaggaataaatggagaag 


9453 9472 1 


4 


SEQ ID NO: 400 


ctttctcctacaagaataa 


916 


935 


SEQ ID NO: 1403 


ttattgctgaatccaaaag 


13648 13667 1 


4 


SEQ ID NO: 401 


gatcaacagccgcttcttt 


989 


1008 


SEQ ID NO: 1404 


aaagccatcactgatgatc 


1661 1680 1 


4 


SEQ ID NO: 402 


atcaacagccgcttctttg 


990 


1009 


SEQ ID NO: 1405 


caaagccatcactgatgat 


1660 1679 1 


4 


SEQ ID NO: 403 


acagccgcttctttggtga 


994 


1013 


SEQ ID NO: 1406 


tcacaaatcctttggctgt 


9667 9686 1 


4 


SEQ ID NO: 404 


aagatgggcctcgcatttg 


1023 


1042 


SEQ ID NO: 1407 


caaaatagaagggaatctt 


2069 2088 1 


4 


SEQ ID NO: 405 


tgttttgaagactctccag 


1082 


1101 


SEQ ID NO: 1408 


ctggtaactactttaaaca 


5487 5506 1 


4 


SEQ ID NO: 406 


ttgaagactctccaggaac 


1086 


1105 


SEQ ID NO: 1409 


gttcaatgaatttattcaa 


13184 13203 1 


4 


SEQ ID NO: 407 


aactgaaaaaactaaccat 


1102 


1121 


SEQ ID NO: 1410 


atggcattttttgcaagtt 


14006 14025 1 


4 


SEQ ID NO: 408 


ctgaaaaaactaaccatct 


1104 


1123 


SEQ ID NO: 1411 


agattgatgggcagttcag 


4564 4583 1 


4 


SEQ ID NO: 409 


aaaactaaccatctctgag 


1109 


1128 


SEQ ID NO: 1412 


ctcaaagaatgactttttt 


2570 2589 1 


4 


SEQ ID NO: 410 


tgagcaaaatatccagaga 


1124 


1143 


SEQ ID NO: 1413 


tctccagataaaaaactca 


12201 12220 1 


4 


SEQ ID NO: 411 


caataagctggttactgag 


1154 


1173 


SEQ ID NO: 1414 


ctcagatcaaagttaattg 


12265 12284 1 


4 


SEQ ID NO: 412 


tactgagctgagaggcctc 


1166 


1185 


SEQ ID NO: 1415 


gagggtagtcataacagta 


10329 10348 1 


4 


SEQ ID NO: 413 


gcctcagtgatgaagcagt 


1180 


1199 


SEQ ID NO: 1416 


actgttgactcaggaaggc 


12572 12591 1 


4 
SEQ ID NO: 414
SEQ ID NO: 415 
SEQ ID NO: 416 
SEQ ID NO: 417 
SEQ ID NO: 418 
SEQ ID NO: 419 
SEQ ID NO: 420 
SEQ ID NO: 421 
SEQ ID NO: 422 
SEQ ID NO: 423 
SEQ ID NO: 424 
SEQ ID NO: 425 
SEQ ID NO: 426 
SEQ ID NO: 427 
SEQ ID NO: 428 
SEQ ID NO: 429 
SEQ ID NO: 430 
SEQ ID NO: 431 
SEQ ID NO: 432 
SEQ ID NO: 433 
SEQ ID NO: 434 
SEQ ID NO: 435 
SEQ ID NO: 436 
SEQ ID NO: 437 
SEQ ID NO: 438 
SEQ ID NO: 439 
SEQ ID NO: 440 
SEQ ID NO: 441 
SEQ ID NO: 442 
SEQ ID NO: 443 
SEQ ID NO: 444 
SEQ ID NO: 445 
SEQ ID NO: 446 
SEQ ID NO: 447 
SEQ ID NO: 448 
SEQ ID NO: 449 
SEQ ID NO: 450 
SEQ ID NO: 451 



agtcacatctctcttgcca 
atctctcttgccacagctg 



tcactttacaagccttggt 

cccttctgatagatgtggt 

gtcacctacctggtggccc 

ccttgtatgcgctgagcca 

gacaaaccctacagggacc 

tgctaattacctgatggaa 



atgaagattacacctattt 
accatggagcagttaactc 



cagaactcaagtcttcaat 

caggctctgcggaaaatgg 

ccaggaggttcttcttcag 

ggttcttcttcagactttc 

tttccttgatgatgcttct 



actttgtggcttcccatat 

gccaatatcttgaactcag 

aatatcttgaactcagaag 



agaattggatatccaagat 
tggatatccaagatctgaa 
atatccaagatctgaaaaa 



tccaactgtcatggacttc 
tcagaaaattctctcggaa 



1196 1215 

1202 1221 

1203 1222 

1210 1229 

1211 1230 
1240 1259 
1324 1343 
1341 1360 
1432 1451 
1472 1491 
1508 1527 
1538 1557 
1540 1559
1552 1571 
1602 1621 
1610 1629 
1621 1640 
1695 1714 
1730 1749 
1736 1755 
1751 1770 
1773 1792 
1788 1807 
1882 1901 
1902 1921 
1905 1924 
1916 1935 

1921 1940 

1922 1941 
1927 1946 

1930 1949 

1931 1950 

1935 1954 

1936 1955 
1942 1961 
1982 2001 
1999 2018 
2044 2063 



SEQ ID NO: 1417 
SEQ ID NO: 1418 
SEQ ID NO: 1419 
SEQ ID NO: 1420 
SEQ ID NO: 1421 
SEQ ID NO: 1422 
SEQ ID NO: 1423 
SEQ ID NO: 1424 
SEQ ID NO: 1425 
SEQ ID NO: 1426 
SEQ ID NO: 1427 
SEQ ID NO: 1428 
SEQ ID NO: 1429 
SEQ ID NO: 1430 
SEQ ID NO: 1431 
SEQ ID NO: 1432 
SEQ ID NO: 1433 
SEQ ID NO: 1434 
SEQ ID NO: 1435 
SEQ ID NO: 1436 
SEQ ID NO: 1437 
SEQ ID NO: 1438 
SEQ ID NO: 1439 
SEQ ID NO: 1440 
SEQ ID NO: 1441 
SEQ ID NO: 1442 
SEQ ID NO: 1443 
SEQ ID NO: 1444 
SEQ ID NO: 1445 
SEQ ID NO: 1446 
SEQ ID NO: 1447 
SEQ ID NO: 1448 
SEQ ID NO: 1449 
SEQ ID NO: 1450 
SEQ ID NO: 1451 
SEQ ID NO: 1452 
SEQ ID NO: 1453 
SEQ ID NO: 1454 



accagatgctgaacagtga 



ttcaggtccatgcaagtca
tcttgaacacaaagtcagt 



gagtaaaccaaaacttggt 



ccatgacctccagctcctg 
ctgaaatacaatgctctgg 
gaaaaacttggaaacaacc 



cagcatgcctagtttctcc 

tcaatatcaaaagcccagc 

atatctggaaccttgaagt 



cttccattctgaatatatt 



tcttcaatttattcttctt 

atcttcaatttattcttct 

ttcacataccagaattcca 

tttttaaccagtcagatat 

ctttttaaccagtcagata 

ctaaattcccatggtcttg 

actaaattcccatggtctt 

tctttctcgggaatattca 



8858 8877 1 4 

2161 2180 1 4 

2160 2179 1 4 

13955 13974 1 4 

11240 11259 1 4 

8140 8159 1 4 

10816 10835 1 4 

3431 3450 1 4 

5578 5597 1 4 

12347 12366 1 4 

9930 9949 1 4 

10909 10928 1 4 

5999 6018 1 4 

8110 8129 1 4 

9016 9035 1 4 

13719 13738 1 4 

1925 1944 1 4 

2477 2496 1 4 

5511 5530 1 4 

4431 4450 1 4 

6885 6904 1 4 

9944 9963 1 4 

12037 12056 1 4 

10729 10748 1 4 

13992 14011 1 4 

13370 13389 1 4 

7265 7284 1 4 

13817 13836 1 4 

13816 13835 1 4 

8317 8336 1 4 

10177 10196 1 4 

10176 10195 1 4 

4965 4984 1 4 

4964 4983 1 4 

10622 10641 1 4 

13937 13956 1 4 

9493 9512 1 4 

8433 8452 1 4 
SEQ ID NO: 452
SEQ ID NO: 453 


agcctcagccaaaatagaa 


2057 
2060 


2076 
2079 


SEQ ID NO: 1455 t 
SEQ ID NO: 1456 t 


SEQ ID NO: 454 


atcttatatttgatccaaa 


2083 


2102 


SEQ ID NO: 1457 t 


SEQ ID NO: 455 


tcttatatttgatccaaat


2084 


2103 


SEQ ID NO: 1458 e 


SEQ ID NO: 456 
SEQ ID NO: 457 


cttcctaaagaaagcatgc 
ctaaagaaagcatgctgaa 


2109 
2113 


2128 
2132 


SEQ ID NO: 1459 c 
SEQ ID NO: 1460 t 


SEQ ID NO: 458 
SEQ ID NO: 459 


taaagaaagcatgctgaaa 
gagattggcttggaaggaa 


2114 
2175 


2133 
2194 


SEQ ID NO: 1461 t 
SEQ ID NO: 1462 t 


SEQ ID NO: 460 


ctttgagccaacattggaa 


2198 


2217 


SEQ ID NO: 1463 t 


SEQ ID NO: 461 


cagacagtgtcaacaaagc 


2245 


2264 


SEQ ID NO: 1464 c 


SEQ ID NO: 462 


cagtgtcaacaaagctttg 


2249 


2268 


SEQ ID NO: 1465 c 


SEQ ID NO: 463 


agtgtcaacaaagctttgt 


2250 


2269 


SEQ ID NO: 1466


SEQ ID NO: 464 


ctgatggtgtctctaaggt 


2290 


2309 


SEQ ID NO: 1467 i 


SEQ ID NO: 465 


tgatggtgtctctaaggtc 


2291 


2310 


SEQ ID NO: 1468


SEQ ID NO: 466 
SEQ ID NO: 467 




2343 
2387 


2362 
2406 


SEQ ID NO: 1469 c 
SEQ ID NO: 1470 c 


gaagctgattaaagatttg 


SEQ ID NO: 468 


aaagatttgaaatccaaag 


2397 


2416 


SEQ ID NO: 1471 c 


SEQ ID NO: 469 


gatgggtgcccgcactctg 


2510 


2529 


SEQ ID NO: 1472 c 


SEQ ID NO: 470 


gggatcccccagatgattg 


2532 


2551 


SEQ ID NO: 1473 c 


SEQ ID NO: 471 


ttttcttcactacatcttc 


2585 


2604 


SEQ ID NO: 1474 c 


SEQ ID NO: 472 


tcttcactacatcttcatg 


2588 


2607 


SEQ ID NO: 1475 c 


SEQ ID NO: 473 


tacatcttcatggagaatg 


2595 


2614 


SEQ ID NO: 1476 c 


SEQ ID NO: 474 


ttcatggagaatgcctttg 


2601 


2620 


SEQ ID NO: 1477 c 


SEQ ID NO: 475 


tcatggagaatgcctttga 


2602 


2621 


SEQ ID NO: 1478 t 


SEQ ID NO: 476 


tttgaactccccactggag 


2616 


2635 


SEQ ID NO: 1479 c 


SEQ ID NO: 477 
SEQ ID NO: 478 
SEQ ID NO: 479 


ttgaactccccactggagc 


2617 
2618 
2627 


2636 
2637 


SEQ ID NO: 1480
SEQ ID NO: 1481 c 
SEQ ID NO: 1482


tgaactccccactggagct 
cactggagctggattacag 


2646 


SEQ ID NO: 480 


actggagctggattacagt 


2628 


2647 


SEQ ID NO: 1483 i 


SEQ ID NO: 481 


agttgcaaatatcttcatc 


2644 


2663 


SEQ ID NO: 1484 c 


SEQ ID NO: 482 


gttgcaaatatcttcatct 


2645 


2664 


SEQ ID NO: 1485 


SEQ ID NO: 483 


aaatatcttcatctggagt 


2650 


2669 


SEQ ID NO: 1486


SEQ ID NO: 484 


taaaactggaagtagccaa 


2695 


2714 


SEQ ID NO: 1487 t 


SEQ ID NO: 485 


ggctgaactggtggcaaaa 


2720 


2739 


SEQ ID NO: 1488 t 


SEQ ID NO: 486 


tgtggagtttgtgacaaat 


2750 


2769 


SEQ ID NO: 1489 


SEQ ID NO: 487 


ttgtgacaaatatgggcat 


2758 


2777 


SEQ ID NO: 1490 


SEQ ID NO: 488 


atgaacaccaacttcttcc 


2811 


2830 


SEQ ID NO: 1491 


SEQ ID NO: 489 


cttccacgagtcgggtctg 


2825 


2844 


SEQ ID NO: 1492 



tattctatccaagattggg 



gcatggcattatgatgaag 
ttcagggtgtggagtttag 



ttccctccattaagttctc 
ttccaatgaccaagaaaag 



acaagaatacgtctacact 
acctcggaacaatcctcag 
gacctgcgcaacgagatca 




catggcattatgatgaaga 
cattatggaggcccatgta 
caaaatcaactttaatgaa 




ttttcttttcagcccagcc 



atgcgtctaccttacacaa 
ggaagctgaagtttatcat 
cagagctatcactgggaag 



7812 7831 1 4 

7814 7833 1 4 

11813 11832 1 4 

14011 14030 1 4 

3606 3625 1 4 
5686 5705 1 4 
9482 9501 1 4 
11701 11720 1 4 
11060 11079 1 4 
6134 6153 1 4 
9849 9868 1 4 
4351 4370 1 4 
3325 3344 1 4 
8823 8842 1 4 
6788 6807 1 4 
5279 5298 1 4 
7606 7625 1 4 
7975 7994 1 4 
9075 9094 1 4 
10374 10393 1 4 

3607 3626 1 4 
9437 9456 1 4 
6599 6618 1 4 
13108 13127 1 4 
9834 9853 1 4 
9833 9852 1 4 
9832 9851 1 4 
9336 9355 1 4
9335 9354 1 4 
6591 6610 1 4 
6590 6609 1 4 
13996 14015 1 4 
7592 7611 1 4 
9220 9239 1 4 
8530 8549 1 4 
9513 9532 1 4 
2869 2888 1 4 
5227 5246 1 4 
SEQ ID NO: 490


gagtcgggtctggaggctc 


2832 


2851 


SEQ ID NO: 1493 


gagcttactggacgaactc 


6132 


6151 1 


4 


SEQ ID NO: 491 


cctaaaagctgggaagctg 


2858 


2877 


SEQ ID NO: 1494 


cagcctccccagccgtagg 


12112 12131 1 


4 


SEQ ID NO: 492 


agctgggaagctgaagttt 


2864 


2883 


SEQ ID NO: 1495 


aaactgttaatttacagct 


5455 


5474 1 


4 


SEQ ID NO: 493 


ccagattagagctggaact 


3106 


3125 


SEQ ID NO: 1496 


agtttccggggaaacctgg 


12718 12737 1 


4 


SEQ ID NO: 494 


ggataccctgaagtttgta 


3200 


3219 


SEQ ID NO: 1497 


tacagtattctgaaaatcc 


8385 


8404 1 


4 


SEQ ID NO: 495 




3244 


3263 


SEQ ID NO: 1498 


aatgagctcatggcttcag 


3809 


3828 1 


4 


SEQ ID NO: 496 


tgtccagtgaagtccaaat 


3289 


3308 


SEQ ID NO: 1499 


attttgagaggaatcgaca 


6349 


6368 1 


4 


SEQ ID NO: 497 


aattccggattttgatgtt 


3305 


3324 


SEQ ID NO: 1500 


aacacatgaatcacaaatt 


8930 


8949 1 


4 


SEQ ID NO: 498 


ttccggattttgatgttga 


3307 


3326 


SEQ ID NO: 1501 


tcaaaacgagcttcaggaa 


13199 13218 1 


4 


SEQ ID NO: 499 


cggaacaatcctcagagtt 


3329 


3348 


SEQ ID NO: 1502 


aacttgtacaactggtccg 


4203 


4222 1 


4 


SEQ ID NO: 500 


tcctcagagttaatgatga 


3337 


3356 


SEQ ID NO: 1503 


tcatcaattggttacagga


7585 


7604 1 


4 


SEQ ID NO: 501 


ctcaccctggacattcaga 


3384 


3403 


SEQ ID NO: 1504 


tctgcagaacaatgctgag 


12431 


12450 1 


4 


SEQ ID NO: 502 


cattcagaacaagaaaatt 


3395 


3414 


SEQ ID NO: 1505 


aattgactttgtagaaatg 


8096 


8115 1 


4 


SEQ ID NO: 503 


actgaggtcgccctcatgg 


3414 


3433 


SEQ ID NO: 1506 


ccatgcaagtcagcccagt 


10916 10935 1 


4 


SEQ ID NO: 504 


ttatttccataccccgttt 


3478 


3497 


SEQ ID NO: 1507 


aaactgcctatattgataa 


13872 


13891 1 


4 


SEQ ID NO: 505 


gtttgcaagcagaagccag 


3493 


3512 


SEQ ID NO: 1508 


ctggacttctcttcaaaac 


5400 


5419 1 


4 


SEQ ID NO: 506 


tttgcaagcagaagccaga 


3494 


3513 


SEQ ID NO: 1509 


tctgggtgtcgacagcaaa 


5264 


5283 1 


4 


SEQ ID NO: 507 


ttgcaagcagaagccagaa 


3495 


3514 


SEQ ID NO: 1510 


ttctgggtgtcgacagcaa 


5263 


5282 1 


4 


SEQ ID NO: 508 


ctgcttctccaaatggact


3546 


3565 


SEQ ID NO: 1511 


agtcaagattgatgggcag 


4559 


4578 1 


4 


SEQ ID NO: 509 


tgctacagcttatggctcc


3569 


3588 


SEQ ID NO: 1512 


ggaggctttaagttcagca 


7601 


7620 1 


4 


SEQ ID NO: 510 


acagcttatggctccacag 


3573 


3592 


SEQ ID NO: 1513 


ctgtatagcaaattcctgt 


5889 


5908 1 


4 


SEQ ID NO: 511 


tttccaagagggtggcatg 


3592 


3611 


SEQ ID NO: 1514 


catggacttcttctggaaa 


8869 


8888 1 


4 


SEQ ID NO: 512 


ccaagagggtggcatggca 


3595 


3614 


SEQ ID NO: 1515 


tgcccagcaagcaagttgg 


9353 


9372 1 


4 


SEQ ID NO: 513 


gtggcatggcattatgatg 


3603 


3622 


SEQ ID NO: 1516 


catccttaacaccttccac 


8063 


8082 1 


4 


SEQ ID NO: 514 


tgatgaagagaagattgaa 


3617 


3636 


SEQ ID NO: 1517 


ttcactgttcctgaaatca 


7863 


7882 1 


4 


SEQ ID NO: 515 


gaagagaagattgaatttg 


3621 


3640 


SEQ ID NO: 1518 


caaaaacattttcaacttc 


5279 


5298 1 


4 


SEQ ID NO: 516 


gagaagattgaatttgaat


3624 


3643 


SEQ ID NO: 1519 


attcataatcccaactctc 


8270 


8289 1 


4 


SEQ ID NO: 517 


tttgaatggaacacaggca 


3636 


3655 


SEQ ID NO: 1520 


tgcctttgtgtacaccaaa 


11228 11247 1 


4 


SEQ ID NO: 518 


aggcaccaatgtagatacc 


3650 


3669 


SEQ ID NO: 1521 


ggtaacctaaaaggagcct 


5583 


5602 1 


4 


SEQ ID NO: 519


caaaaaaatgacttccaat 


3668 


3687 


SEQ ID NO: 1522 


attgaagtacctacttttg 


8358 


8377 1 


4 


SEQ ID NO: 520 


aaaaaaatgacttccaatt 


3669 


3688 


SEQ ID NO: 1523 


aattgaagtacctactttt 


8357 


8376 1 


4 


SEQ ID NO: 521 




3670 


3689 


SEQ ID NO: 1524 


aaatccaatctcctctttt 


8398 


8417 1 


4 


SEQ ID NO: 522 


cagagtccctcaaacagac 


3752 


3771 


SEQ ID NO: 1525 


gtctgtgggattccatctg 


4082 


4101 1 


4 


SEQ ID NO: 523 


aaattaatagttgcaatga 


3795 


3814 


SEQ ID NO: 1526 


tcataagttcaatgaattt 


13176 


13197 1 


4 


SEQ ID NO: 524 


ttcaacctccagaacatgg 


3891 


3910 


SEQ ID NO: 1527 


ccattgaccagatgctgaa 


8134 


8153 1 


4 


SEQ ID NO: 525 


tgggattgccagacttcca 


3907 


3926 


SEQ ID NO: 1528 


tggaaatgggcctgcccca 


8895 


8914 1 


4 


SEQ ID NO: 526 
SEQ ID NO: 527 


cagtttgaaaattgagatt 
aaaaattgagattcctttg 


3986 
3992 


4005 
4011 


SEQ ID NO: 1529 
SEQ ID NO: 1530


aatcacaactcctccactg 


9533 
13686 


9552 1 
13705 1 


4 
4 
SEQ ID NO: 528


tttg ccttttg g tg g ca 3 a 






SEQ ID NO: 1531


tttgagaggaatcgacaaa 


6351 6370 1


4 


SEQ ID NO: 529


ctccagagatctaaagatg


4028 


4047 


SEQ ID NO: 1532


catcaattggttacaggag 


7586 7605 1 


4 


SEQ ID NO: 530


tctaaagatgttagagact 






SEQ ID NO: 1533


agtccttcatgtccctaga


10025 10044 1




SEQ ID NO: 531


ctgtgggattccatctgcc 


4084 


4103 


SEQ ID NO: 1534


ggcattttgaaaaaaacag 


9727 9746 1 


4 




atctgccatctcgagagtt 






SEQ ID NO: 1535




8548 8567 1 




SEQ ID NO: 533


tctcgagagttccaagtcc


4104 


4123 


SEQ ID NO: 1536


ggacattcctctagcgaga 


8207 8226 1 


4 


SEQ ID NO: 534 


agtccctacttttaccatt 


4118 


4137 


SEQ ID NO: 1537 


aatgaatacagccaggact 


6078 6097 1




SEQ ID NO: 535 


acttttaccattcccaagt 


4125 


4144 


SEQ ID NO: 1538 


actttgtagaaatgaaagt 


8101 8120 1




SEQ ID NO: 536 


cattcccaagttgtatcaa 


4133 


4152 


SEQ ID NO: 1539 


ttgaaggacttcaggaatg 


12001 12020 1 




SEQ ID NO: 537


accacatgaaggctgactc 


4276 




SEQ ID NO: 1540


gagtaaaccaaaacttggt 


9016 9035 1 




SEQ ID NO: 538 


tttcctacaatgtgcaagg 


4309 


4328 


SEQ ID NO: 1541 


cctttaacaattcctgaaa 






SEQ ID NO: 539


ctggagaaacaacatatga 






SEQ ID NO: 1542


tcattctgggtctttccag 


11027 11046 1




SEQ ID NO: 540


atcatgtgatgggtctcta


4370 


4389 


SEQ ID NO: 1543


tagaattacagaaaatgat 


6557 6576 1 




SEQ ID NO: 541


catgtgatgggtctctacg 


4372 


4391 


SEQ ID NO: 1544


cgtaggcaccgtgggcatg 


12125 12144 1




SEQ ID NO: 542


ttctagattcgaatatcaa 


4399 




SEQ ID NO: 1545


ttgatgatgctgtcaagaa 


7300 7319 1 


4 


SEQ ID NO: 543 


tggggaccacagatgtctg 


4491 


4510 


SEQ ID NO: 1546 


g g 




* 


SEQ ID NO: 544


ctaacactggccggctcaa 


4636 


4655 


SEQ ID NO: 1547


ttgaggctattgatgttag 


6976 6995 1 




SEQ ID NO: 545


taacactggccggctcaat 


4637 


4656 


SEQ ID NO: 1548


attgaggctattgatgtta


6975 6994 1 




SEQ ID NO: 546


aacactggccggctcaatg 






SEQ ID NO: 1549


cattgaggctattgatgtt 


6974 6993 1 




SEQ ID NO: 547


ctggccggctcaatggaga 






SEQ ID NO: 1550


tctccatctgcgctaccag 


12065 12084 1 


4 


SEQ ID NO: 548


agataacaggaagatatga 


4705 


4724 


SEQ ID NO: 1551


tcatctcctttcttcatct 


10202 10221 1 


4 


SEQ ID NO: 549


tccctcacctccacctctg 






SEQ ID NO: 1552


cagatatatatctcaggga 


8176 8195 1 


4 


SEQ ID NO: 550


agctgactttaaaatctga 


4810 


4829 


SEQ ID NO: 1553


tcaggctcttcagaaagct 


7922 7941 1 


4 


SEQ ID NO: 551


ctgactttaaaatctgaca 


4812 


4831 


SEQ ID NO: 1554


g caaga aaacaa cag 


8732 8751 1 




SEQ ID NO: 552


caagatggatatgaccttc 






SEQ ID NO: 1555


gaagtagtactgcatcttg 


6835 6854 1 


4 


SEQ ID NO: 553 


gctgcgttctgaatatcag 






SEQ ID NO: 1556


ctgagtcccagtgcccagc 


9342 9361 1


4 


SEQ ID NO: 554


cgttctgaatatcaggctg 


4905 


4924 


SEQ ID NO: 1557


cagcaagtacctgagaacg 


8603 8622 1 


4 


SEQ ID NO: 555


aattcccatggtcttgagt 


4968 


4987 


SEQ ID NO: 1558


actcagatcaaagttaatt 


12264 12283 1 


4 


SEQ ID NO: 556


tggtcttgagttaaatgct 






SEQ ID NO: 1559


agcacagtacgaaaaacca 


10801 10820 1 


4 


SEQ ID NO: 557 


cttgagttaaatgctgaca


4980 




SEQ ID NO: 1560 


tgtccctagaaatctcaag 


10034 10053 1




SEQ ID NO: 558


ttgagttaaatgctgacat




5000 


SEQ ID NO: 1561


atgtccctagaaatctcaa 


10033 10052 1 




SEQ ID NO: 559


tgagttaaatgctgacatc 


4982 


5001 


SEQ ID NO: 1562




4725 4744 1 




SEQ ID NO: 560




5086 


5105 


SEQ ID NO: 1563 


aJgaTartc^tcIaagt 


12259 12278 1 


4 


SEQ ID NO: 561 


agtgtagtctcctggtgct 


5092 


5111 


SEQ ID NO: 1564 


agcagccagtggcaccact 


12506 12525 1 


4 


SEQ ID NO: 562 


gtgctggagaatgagctga 


5106 


5125 


SEQ ID NO: 1565 


tcagccaggtttatagcac 


7726 7745 1 


4 


SEQ ID NO: 563 


ctggggcatctatgaaatt 


5143 


5162 


SEQ ID NO: 1566 


aatttctgattaccaccag 


13571 13590 1 


4 


SEQ ID NO: 564 


atggccgcttcagggaaca 


5170 


5189 


SEQ ID NO: 1567 


tgttttttggaaatgccat 


8641 8660 1 


4 


SEQ ID NO: 565 


ttcagtctggatgggaaag 


5199 


5218 


SEQ ID NO: 1568 


ctttgacaggcattttgaa 


9719 9738 1 


4 
SEQ ID NO: 566


ccatgattctgggtgtcga 


5257 


5276 


SEQ ID NO: 1569 


tcgatgcacatacaaatgg 


5830 5849 1 


4 


SEQ ID NO: 567 


aaaacattttcaacttcaa 


5281 




SEQ ID NO: 1570


ttgatgttag ® 9*9ctttt 


6985 7004 1




SEQ ID NO: 568 


cttaagctctcaaatgaca 


5316 


5335 


SEQ ID NO: 1571 




7247 7266 1 




SEQ ID NO: 569 


ttaagctctcaaatgacat 


5317 


5336 


SEQ ID NO: 1572 


atgtcctacaacaagttaa


7246 7265 1 


4 


SEQ ID NO: 570 


catgatgggctcatatgct 


5333 


5352 


SEQ ID NO: 1573 


agcatctttggctcacatg


7616 7635 1 


4 


SEQ ID NO: 571 


tgggctcatatgctgaaat


5338 




SEQ ID NO: 1574


atttatcaaaagaagccca 


•1 OO'ZA 'iOQR'^ A 

i ^»o*f i ,£yo«3 i 


4 


SEQ ID NO: 572 


actggacttctcttcaaaa




5418 


SEQ ID NO: 1575 




oo t £. ooy i i 




SEQ ID NO: 573 


acttctcttcaaaacttga 


5404 


5423 


SEQ ID NO: 1576 


StTgTa^agTcatgt 


6496 6515 1 




SEQ ID NO: 574 


ctgacaagttttataagca 


5437 


5456 


SEQ ID NO: 1577 


tgctttgtgagtttatcag


9685 9704 1 




SEQ ID NO: 575 


aagttttataagcaaactg 


5442 




SEQ ID NO: 1578


cagtcatgtagaaaaactt 


AA11 AAAf\ 1 




SEQ ID NO: 576 


ctgttaatttacagctaca 




5477 


SEQ ID NO: 1579 


tgtactggaaaacgtacag 


OoOLF DOSS I 




SEQ ID NO: 577 


ttacagctacagccctatt 


5466 


5485 


SEQ ID NO: 1580 


aatattgatcaatttgtaa 


6417 6436 1 




SEQ ID NO: 578 


tctggtaactactttaaac 






SEQ ID NO: 1581


gtttgaaaaacaaagcaga


11812 11831 1 




SEQ ID NO: 579 


tttaaacagtgacctgaaa 


5498 


5517 


SEQ ID NO: 1582 




7024 7043 1 


4 


SEQ ID NO: 580 


ttaaacagtgacctgaaat 


5499 


5518 


SEQ ID NO: 1583 


atttLa^aaettoa 
c caagaac aa 


10426 10445 1 




SEQ ID NO: 581 


cagtgacctgaaatacaat 






SEQ ID NO: 1584 


attggcgtggagcttactg 


6123 6142 1 


4 


SEQ ID NO: 582 




5576 


5595 


SEQ ID NO: 1585 


ttttgctggagaagccaca 


lu/O/ IVl l\> I 


4 


SEQ ID NO: 583 


ttatcagcaagctataaag 


5649 


5668 


SEQ ID NO: 1586 




12756 12775 1 




SEQ ID NO: 584 


ggttcagggtgtggagttt 


5684 


5703 


SEQ ID NO: 1587 


aaacacctaagagtaaacc 


9006 9025 1 


4 


SEQ ID NO: 585 


attcagactcactgcattt 


5767 


5786 


SEQ ID NO: 1588 




8429 8448 1 


4 


SEQ ID NO: 586 


ttcagactcactgcatttc 


5768 


5787 


SEQ ID NO: 1589 


gaaatattatgaacttgaa 


13304 13323 1 




SEQ ID NO: 587 


tacaaatggcaatgggaaa 


5840 


5859 


SEQ ID NO: 1590 




11168 11187 1 


4 


SEQ ID NO: 588 


gctgtatagcaaattcctg 


5888 


5907 


SEQ ID NO: 1591 


caggtccatgcaagtcagc


10911 10930 1 




SEQ ID NO: 589 


tgagcagacaggcacctgg 


6035 


6054 


SEQ ID NO: 1592 




8333 8352 1 


4 


SEQ ID NO: 590 


ggcacctggaaactcaaga 


6045 


6064 


SEQ ID NO: 1593 


te7crtgtttcaact a cc Ca 


11213 11232 1 


4 


SEQ ID NO: 591 


tgaatacagccaggacttg 


6080 




SEQ ID NO: 1594 


caagtaagtgctaggttca 


9372 9391 1 


4 


SEQ ID NO: 592 




6081 


6100 


SEQ ID NO: 1595 


ccaacacttacttgaattc


10660 10679 1 


4 


SEQ ID NO: 593 


rggacgaadc^ggctga 9 


6139 


6158 


SEQ ID NO: 1596 




7931 7950 1 




SEQ ID NO: 594 


ttttactcagtgagcccat 


6193 


6212 


SEQ ID NO: 1597 




8870 8889 1 




SEQ ID NO: 595 


gatgagagatgccgttgag 


6233 


6252 


SEQ ID NO: 1598 


ctcatctcctttcttcatc 


10201 10220 1 


4 


SEQ ID NO: 596 


aattgttgcttttgtaaag 


6269 


6288 


SEQ ID NO: 1599 


cttttctaaacttgaaatt 


9056 9075 1 


4 


SEQ ID NO: 597 


cttttgtaaagtatgataa 


6277 


6296 


SEQ ID NO: 1600 


ttatgaacttgaagaaaag 


13310 13329 1 


4 


SEQ ID NO: 598 


tttgtaaagtatgataaaa 


6279 


6298 


SEQ ID NO: 1601 


ttttcacattagatgcaaa 


8413 8432 1 


4 


SEQ ID NO: 599 


tccattaacctcccatttt 


6312 


6331 


SEQ ID NO: 1602 


aaaattgatgatatctgga 


10719 10738 1 


4 


SEQ ID NO: 600 


ccattaacctcccattttt 


6313 


6332 


SEQ ID NO: 1603 


aaaagggtcatggaaatgg 


8885 8904 1 


4 


SEQ ID NO: 601 


cttgcaagaatattttgag 


6338 


6357 


SEQ ID NO: 1604 


ctcaattttgattttcaag 


8520 8539 1 


4 


SEQ ID NO: 602 


agaatattttgagaggaat 


6344 


6363 


SEQ ID NO: 1605 


attccctccattaagttct 


11700 11719 1 


4 


SEQ ID NO: 603 


attatagttgtactggaaa 


6372 


6391 


SEQ ID NO: 1606 




10427 10446 1 


4 
SEQ ID NO: 604


gaagcacatcaatattgat 


6407 


6426 


SEQ ID NO: 1607 


atcagttcagataaacttc 


7991 


8010 1 


4 


SEQ ID NO: 605 


acatcaatattgatcaatt 


6412 


6431 


SEQ ID NO: 1608 


aattccctgaagttgatgt 


11479 11498 1 


4 


SEQ ID NO: 606 


gaaaactcccacagcaagc


6457


6476 


SEQ ID NO: 1609 


gctttctcttccacatttc 


10052 10071 1 


4 


SEQ ID NO: 607 


ctgaattcattcaattggg 


6486 


6505 


SEQ ID NO: 1610 


cccatttacagatcttcag 


11363 11382 1 


4 


SEQ ID NO: 608 


tgaattcattcaattggga 


6487 


6506 


SEQ ID NO: 1611 


tcccatttacagatcttca 


11362 


11381 1 


4 


SEQ ID NO: 609 


aactgactgctctcacaaa 


6532 


6551 


SEQ ID NO: 1612 


tttgaggattccatcagtt 


7979 


7998 1 


4 


SEQ ID NO: 610 


aaaagtatagaattacaga 


6550 


6569 


SEQ ID NO: 1613 


tctggctccctcaactttt 


9042 


9061 1 


4 


SEQ ID NO: 611 


atcaactttaatgaaaaac 


6603 


6622 


SEQ ID NO: 1614 


gtttattgaaaatattgat 


6803 


6822 1 


4 


SEQ ID NO: 612 


tgatttgaaaatagctatt 


6686 


6705 


SEQ ID NO: 1615 


aatattattgatgaaatca 


6708 


6727 1 


4 


SEQ ID NO: 613 


atttgaaaatagctattgc


6688 


6707 


SEQ ID NO: 1616 


gcaagaacttaatggaaat 


10433 


10452 1 


4 


SEQ ID NO: 614 


attgctaatattattgatg 


6702 


6721 


SEQ ID NO: 1617 


catcacactgaataccaat 


10151 


10170 1 


4 


SEQ ID NO: 615 


gaaaaattaaaaagtcttg 


6729 


6748 


SEQ ID NO: 1618 


caagagcttatgggatttc 


11153 11172 1 


4 


SEQ ID NO: 616 


actatcatatccgtgtaat 


6754 


6773 


SEQ ID NO: 1619 


attactttgagaaattagt 


7273 


7292 1 


4 


SEQ ID NO: 617 


tattgattttaacaaaagt 


6815 


6834 


SEQ ID NO: 1620 


acttgacttcagagaaata 


11396 


11415 1 


4 


SEQ ID NO: 618 


ctgcagcagcttaagagac 


6906 


6925 


SEQ ID NO: 1621 


gtcttcagtgaagctgcag 


10691 


10710 1 


4 


SEQ ID NO: 619 


aaaacaacacattgaggct 


6965 


6984 


SEQ ID NO: 1622 


agcctcacctcttactttt 


10563 10582 1 


4 


SEQ ID NO: 620


ttgagcatgtcaaacactt 


7051 


7070 


SEQ ID NO: 1623 


aagtagctgagaaaatcaa 


7096 


7115 1 


4 


SEQ ID NO: 621 


tttgaagtagctgagaaaa 


7092 


7111 


SEQ ID NO: 1624 


ttttcacattagatgcaaa 


8413 


8432 1 


4 


SEQ ID NO: 622 


ttagtagagttggcccacc 


7191 


7210 


SEQ ID NO: 1625 


ggtggactcttgctgctaa 


7768 


7787 1 


4 


SEQ ID NO: 623 


tgaaggagactattcagaa 


7219 


7238 


SEQ ID NO: 1626 


ttctcaattttgattttca 


8518 


8537 1 


4 


SEQ ID NO: 624 


gagactattcagaagctaa 


7224 


7243 


SEQ ID NO: 1627 


ttagccacagctctgtctc 


10293 


10312 1 


4 


SEQ ID NO: 625 


aattagttggatttattga 


7285 


7304 


SEQ ID NO: 1628 


tcaagaagcttaatgaatt 


7312 


7331 1 


4 


SEQ ID NO: 626 


gcttaatgaattatctttt 


7319 


7338 


SEQ ID NO: 1629 


aaaacgagcttcaggaagc 


13201 


13220 1 


4 


SEQ ID NO: 627 


ttaacaaattccttgacat 


7357 


7376 


SEQ ID NO: 1630 


atgtcctacaacaagttaa 


7246 


7265 1 


4 


SEQ ID NO: 628 


aaattaaagtcatttgatt 


7386 


7405 


SEQ ID NO: 1631 


aatcctttgacaggcattt 


9715 


9734 1 


4 


SEQ ID NO: 629 


gactcaatggtgaaattca 


7456 


7475 


SEQ ID NO: 1632 


tgaaattcaatcacaagtc 


9068 


9087 1 


4 


SEQ ID NO: 630 


gaaattcaggctctggaac 


7467 


7486 


SEQ ID NO: 1633 


gttctcaattttgattttc 


8517 


8536 1 


4 


SEQ ID NO: 631 


actaccacaaaaagctgaa 


7484 


7503 


SEQ ID NO: 1634 


ttcaggaactattgctagt 


10637 


10656 1 


4 


SEQ ID NO: 632 


ccaaaataaccttaatcat 


7570 


7589 


SEQ ID NO: 1635 


atgatttccctgaccttgg 


10942 


10961 1 


4 


SEQ ID NO: 633 


aaataaccttaatcatcaa 


7573 


7592 


SEQ ID NO: 1636 


ttgaagtaaaagaaaattt 


10741 


10760 1 


4 


SEQ ID NO: 634 


tttaagttcagcatctttg 


7607 


7626 


SEQ ID NO: 1637 


caaatctggatttcttaaa 


9472 


9491 1 


4 


SEQ ID NO: 635 


caggtttatagcacacttg 


7731 


7750 


SEQ ID NO: 1638 


caagggttcactgttcctg 


7857 


7876 1 


4 


SEQ ID NO: 636 


gttcactgttcctgaaatc 


7862 


7881 


SEQ ID NO: 1639 




8914 


8933 1 


4 


SEQ ID NO: 637 


cactgttcctgaaatcaag 


7865 


7884 


SEQ ID NO: 1640 


cttgaacacaaagtcagtg 


6000 


6019 1 


4 


SEQ ID NO: 638 


actgttcctgaaatcaaga 


7866 


7885 


SEQ ID NO: 1641 


tcttgaacacaaagtcagt 


5999 


6018 1 


4 


SEQ ID NO: 639 


gcctgcctttgaagtcagt


7901 


7920 


SEQ ID NO: 1642 


actgttgactcaggaaggc 


12572 


12591 1 


4 


SEQ ID NO: 640 


taacagatttgaggattcc 


7972 


7991 


SEQ ID NO: 1643 


ggaagcttctcaagagtta 


13214 13233 1 


4 


SEQ ID NO: 641 


gttttccacaccagaattt


8042 


8061 


SEQ ID NO: 1644 


aaatttctctgctggaaac 


9410 


9429 1 


4 
SEQ ID NO: 642


tcagaaccattgaccagat 


8128 


8147 


SEQ ID NO: 1645 


atctgcagaacaatgctga 


12430 12449 1 


4 


SEQ ID NO: 643 


tagcgagaatcaccctgcc 


8218 


8237 


SEQ ID NO: 1646 


ggcagcttctggcttgcta 


12293 12312 1 


4 


SEQ ID NO: 644 


ccttaatgattttcaagtt 


8291 


8310 


SEQ ID NO: 1647 


aactgttgactcaggaagg 


12571 12590 1 


4 


SEQ ID NO: 645 


acataccagaattccagct 


8320 


8339 


SEQ ID NO: 1648 


agctgccagtccttcatgt 


10018 10037 1 


4 


SEQ ID NO: 646 


aatgctgacatagggaatg 


8430 


8449 


SEQ ID NO: 1649 




9997 10016 1 


4 


SEQ ID NO: 647 


atgctgacatagggaatgg 


8431 


8450 


SEQ ID NO: 1650 


ccaXagatcacgTcat 


9237 9256 1 


4 


SEQ ID NO: 648 


aaccacctcagcaaacgaa


8450


8469 


SEQ ID NO: 1651 


ttcgttttccattaaggtt 


9283 9302 1 


4 


SEQ ID NO: 649 


agcaggtatcgcagcttcc 


8468 


8487 


SEQ ID NO: 1652 


ggaagtggccctgaatgct 


10964 10983 1 


4 


SEQ ID NO: 650 


tgcacaactctcaaaccct 


8543 


8562 


SEQ ID NO: 1653 




13493 13512 1 


4 


SEQ ID NO: 651 


aggagtcagtgaagttctc 


8584 


8603 


SEQ ID NO: 1654 


gagaacttactatcatcct 


13780 13799 1 


4 


SEQ ID NO: 652 


tttttggaaatgccattga 


8644 


8663 


SEQ ID NO: 1655 


tcaatgaatttattcaaaa 


13186 13205 1 


4 


SEQ ID NO: 653 


aatggagtgattgtcaaga 


8721 


8740 


SEQ ID NO: 1656 


tcttttcagcccagccatt


9223 9242 1 


4 


SEQ ID NO: 654 


gtcaagataaacaatcagc 


8733 


8752 


SEQ ID NO: 1657 


gctgactttaaaatctgac 


4811 4830 1


4 


SEQ ID NO: 655 


tccacaaattgaacatccc 


8779 


8798 


SEQ ID NO: 1658 


gggatttcctaaagctgga 


11164 11183 1 


4 


SEQ ID NO: 656 


ttgaacatccccaaactgg 


8787 


8806 


SEQ ID NO: 1659 


ccagtttccagggactcaa 


12595 12614 1 


4 


SEQ ID NO: 657 


acatccccaaactggactt 


8791 


8810 


SEQ ID NO: 1660 


aagtcgattcccagcatgt 


9082 9101 1 


4 


SEQ ID NO: 658 


acttctctagtcaggctga 


8806 


8825 


SEQ ID NO: 1661 


tcagatggaaaaatgaagt 


11002 11021 1 


4 


SEQ ID NO: 659 


tgaatcacaaattagtttc 


8936 


8955 


SEQ ID NO: 1662 


gaaagtccataatggttca 


12809 12828 1 


4 


SEQ ID NO: 660 


agaaggacccctcacttcc 


8960 


8979 


SEQ ID NO: 1663 


ggaagaagaggcagcttct 


12284 12303 1 


4 


SEQ ID NO: 661 


ttggactgtccaataagat 


8980 


8999 


SEQ ID NO: 1664 


atctaaatgcagtagccaa 


11626 11645 1 


4 


SEQ ID NO: 662 


actgtccaataagatcaat 


8984 


9003 


SEQ ID NO: 1665 


attgataaaaccatacagt 


13883 13902 1 


4 


SEQ ID NO: 663 


ctgtccaataagatcaata 


8985 


9004 


SEQ ID NO: 1666 


tattgataaaaccatacag 


13882 13901 1 


4 


SEQ ID NO: 664 


gtttatgaatctggctccc 


9033 


9052 


SEQ ID NO: 1667 


gggaatctgatgaggaaac 


12247 12266 1 


4 


SEQ ID NO: 665 


atgaatctggctccctcaa 


9037 


9056 


SEQ ID NO: 1668 


ttgagttgcccaccatcat 


11659 11678 1 


4 


SEQ ID NO: 666 


ctcaacttttctaaacttg 


9051 


9070 


SEQ ID NO: 1669 


caagatcgcagactttgag 


11645 11664 1 


4 


SEQ ID NO: 667 


ctaaaggcatggcactgtt 


9121 


9140 


SEQ ID NO: 1670 


aacagaaacaatgcattag 


9741 9760 1 


4 


SEQ ID NO: 668 


aaggcatggcactgtttgg 


9124 


9143 


SEQ ID NO: 1671 


ccaagaaaaggcacacctt 


11069 11088 1


4 


SEQ ID NO: 669 




9254 


9273 


SEQ ID NO: 1672 


ccctaacagatttgaggat 


7969 7988 1 


4 


SEQ ID NO: 670 


ggaatttgaaagttcgttt 


9271 


9290 


SEQ ID NO: 1673 


aaacaaacacaggcattcc 


9647 9666 1 


4 


SEQ ID NO: 671 


aataactatgcactgtttc 


9324 


9343 


SEQ ID NO: 1674 


gaaatactgttttcctatt 


12828 12847 1 


4 


SEQ ID NO: 672 


gaaacaacgagaacattat 


9424 


9443 


SEQ ID NO: 1675 


ataaactgcaagatttttc 


13600 13619 1 


4 


SEQ ID NO: 673 


ttcttgaaaacgacaaagc 


9591 


9610 


SEQ ID NO: 1676 


gctttccaatgaccaagaa 


11057 11076 1 


4 


SEQ ID NO: 674 


ataagaaaaacaaacacag


9640


9659 


SEQ ID NO: 1677 


ctgtgctttgtgagtttat 


9682 9701 1 


4 


SEQ ID NO: 675 


aaaacaaacacaggcattc 


9646 


9665 


SEQ ID NO: 1678 


gaatttgaaagttcgtttt 


9272 9291 1 


4 


SEQ ID NO: 676 


gcattccatcacaaatcct 


9659 


9678 


SEQ ID NO: 1679 


aggaagtggccctgaatgc 


10963 10982 1 


4 


SEQ ID NO: 677 


tttgaaaaaaacagaaaca 


9732 


9751 


SEQ ID NO: 1680 


tgttgaaagatttatcaaa 


12925 12944 1 


4 


SEQ ID NO: 678 


caatgcattagattttgtc 


9749 


9768 


SEQ ID NO: 1681 


gacaagaaaaaggggattg 


10271 10290 1 


4 


SEQ ID NO: 679 


caaagctgaaaaatctcag 


9809 


9828 


SEQ ID NO: 1682 


ctgagaacttcatcatttg 


11430 11449 1 


4 
SEQ ID NO: 680


cctggatacactgttccag 


9855 


9874 


SEQ ID NO: 1683 


ctggacttctctagtcagg


8802 8821 1 


4 


SEQ ID NO: 681 


gttgaagtgtctccattca 


9882 


9901 


SEQ ID NO: 1684 


tgaatctggctccctcaac


9038 9057 1 


4 


SEQ ID NO: 682 


tttctccatcctaggttct 


9956 




SEQ ID NO: 1685 




6885 6904 1 


4 


SEQ ID NO: 683 


ttctccatcctaggttctg 


9957 


9976 


SEQ ID NO: 1686 


SgaatccJgatacaagaa 


6884 6903 1 




SEQ ID NO: 684 




10011 


10030 


SEQ ID NO: 1687 


ggacagtgaaatattatga 


13297 13316 1 




SEQ ID NO: 685 


tgctgaactttttaaccag 


10169 


10188 


SEQ ID NO: 1688 


ctggatgtaaccaccagca


11178 11197 1 


4 


SEQ ID NO: 686 


ctcctttcttcatcttcat 


10206 


10225 


SEQ ID NO: 1689 


atgaagcttgctccaggag 


13764 13783 1 


4 


SEQ ID NO: 687 


tgtcattgatgcactgcag 


10226 


10245 


SEQ ID NO: 1690 




12072 12091 1 


4 


SEQ ID NO: 688 


tgatgcactgcagtacaaa 


10232 


10251 


SEQ ID NO: 1691 


tttgagttgcccaccatca 


11658 11677 1 




SEQ ID NO: 689 


agctctgtctctgagcaac 


10301 


10320 


SEQ ID NO: 1692 


gttgaccacaagcttagct 


10539 10558 1 


4 


SEQ ID NO: 690 


agccgaaattccaattttg 


10400 


10419 


SEQ ID NO: 1693 




13963 13982 1 


4 


SEQ ID NO: 691 


ttgagaatgaatttcaagc 


10416 


10435 


SEQ ID NO: 1694 


gcttcaggaagcttctcaa 


13208 13227 1 




SEQ ID NO: 692 


aaacctactgtctcttcct 


10461 


10480 


SEQ ID NO: 1695 


aggaaggccaagccagttt 


12583 12602 1 




SEQ ID NO: 693 


tacttttccattgagtcat 


10575 


10594 


SEQ ID NO: 1696 




12355 12374 1 




SEQ ID NO: 694 


tcaggtccatgcaagtcag 


10910 


10929 


SEQ ID NO: 1697 


ctgacatcttaggcactga


4993 5012 1 




SEQ ID NO: 695 


atgcaagtcagcccagttc 


10918 


10937 


SEQ ID NO: 1698 


gaactcagaaggatggcat 


13994 14013 1 




SEQ ID NO: 696 


tgaatgctaacactaagaa 


10975 


10994 


SEQ ID NO: 1699 


ttctcaattttgattttca 


8518 8537 1 


4 


SEQ ID NO: 697 


agaagatcagatggaaaaa 


10996 


11015 


SEQ ID NO: 1700 


ttttctaaatggaacttct 


12165 12184 1 


4 


SEQ ID NO: 698 


ggctattcattctccatcc 


11256 


11275 


SEQ ID NO: 1701 


ggatctaaatgcagtagcc 


11624 11643 1 


4


SEQ ID NO: 699 


aaagttttggctgataaat 


11280 


11299 


SEQ ID NO: 1702 




9481 9500 1 




SEQ ID NO: 700 


agttttggctgataaattc 


11282 


11301 


SEQ ID NO: 1703 


gaatctggctccctcaact 


9039 9058 1 


4 


SEQ ID NO: 701 


ctgggctgaaactaaatga 


11308 


11327 


SEQ ID NO: 1704 


tcattctgggtctttccag 


11027 11046 1 


4 


SEQ ID NO: 702 


cagagaaatacaaatctat 


11405 


11424 


SEQ ID NO: 1705 




8865 8884 1 


4 


SEQ ID NO: 703 


gaggtaaaattccctgaag 


11472 


11491 


SEQ ID NO: 1706 


cttctggcttgctaacctc 


12298 12317 1 




SEQ ID NO: 704 


cttttttgagataaccgtg 


11537 


11556 


SEQ ID NO: 1707 




13715 13734 1 




SEQ ID NO: 705 


gctggaattgtcattcctt 


11727 


11746 


SEQ ID NO: 1708 


aaggcatctccacctcagc 


12094 12113 1 




SEQ ID NO: 706 


gtgtataatgccacttgga 


11787 


11806 


SEQ ID NO: 1709 




13096 13115 1 


4 


SEQ ID NO: 707 


attccacatgcagctcaac 


11851 


11870 


SEQ ID NO: 1710 


gttgagaagccccaagaat 


6246 6265 1 


4 


SEQ ID NO: 708 


tgaagaagatggcaaattt 


11984 


12003 


SEQ ID NO: 1711 




9212 9231 1 


4 


SEQ ID NO: 709 


atcaaaagcccagcgttca 


12042 


12061 


SEQ ID NO: 1712 


tgaaagtcaagcatctgat 


12661 12680 1 


4 


SEQ ID NO: 710 


gtgggcatggatatggatg 


12135 


12154 


SEQ ID NO: 1713 


catccttaacaccttccac 


8063 8082 1 




SEQ ID NO: 711 


aaatggaacttctactaca 


12171 


12190 


SEQ ID NO: 1714 


tgtaccataagccatattt 


10080 10099 1 


4 


SEQ ID NO: 712 


aaaaactcaccatattcaa 


12211 


12230 


SEQ ID NO: 1715 


ttgatgttagagtgctttt 


6985 7004 1 


4 


SEQ ID NO: 713 


ctgagaagaaatctgcaga 


12420 


12439 


SEQ ID NO: 1716 


tctgcacagaaatattcag 


13439 13458 1 


4 


SEQ ID NO: 714 


acaatgctgagtgggttta 


12439 


12458 


SEQ ID NO: 1717 


taaatggagtctttattgt 


14078 14097 1 


4 


SEQ ID NO: 715 


caatgctgagtgggtttat 


12440 


12459 


SEQ ID NO: 1718 


ataaatggagtctttattg 


14077 14096 1 


4 


SEQ ID NO: 716 


ttaggcaaattgatgatat 


12469 


12488 


SEQ ID NO: 1719 


atattgtcagtgcctctaa 


13384 13403 1 


4 


SEQ ID NO: 717 


ataaactaatagatgtaat 


12889 


12908 


SEQ ID NO: 1720 


attactatgaaaaatttat 


13633 13652 1 


4 
SEQ ID NO: 718


ccaactaatagaagataac


13031 


13050 


SEQ ID NO: 719 




13087 


13106 


SEQ ID NO: 720 


tttaaattgttgaaagaaa


13143 


13162 


SEQ ID NO: 721


aagttcaatgaatttattc 






SEQ ID NO: 722 


gaagaaaaga ag cag 


13318 


13337 


SEQ ID NO: 723 




13369 


13388 


SEQ ID NO: 724 


cacagaaatattcaggaat 


13443 


13462 


SEQ ID NO: 725 


ccattgcgacgaagaaaat 


13552 


13571 


SEQ ID NO: 726 


tataaactgcaagattttt


13599 


13618 


SEQ ID NO: 727 


tctgattactatgaaaaat


13629 


13648 


SEQ ID NO: 728 




13718 


13737 


SEQ ID NO: 729 


tgaagcttgctccaggaga 


13765 


13784 


SEQ ID NO: 730 


tgaactggacctgcaccaa 


13947 


13966 


SEQ ID NO: 731 




14050 


14069 


SEQ ID NO: 732 


gattcgaatatcaaattca 


4404 


4423 


SEQ ID NO: 733 


atttgtttgtcaaagaagt


4543 


4562 


SEQ ID NO: 734 


tctcggttgctgccgctga 


25 


44 


SEQ ID NO: 735 


gctgaggagcccgcccagc 


39 


58 


SEQ ID NO: 736 


ctggtctgtccaaaagatg 


219 


238 


SEQ ID NO: 737 


ctgagagttccagtggagt 


283 


302 


SEQ ID NO: 738 


cagtgcaccctgaaagagg 


396 


415 


SEQ ID NO: 739 


ctctgaggagtttgctgca 


464 


483 


SEQ ID NO: 740 


acatcaagaggggcatcat 


574 


593 


SEQ ID NO: 741 


ctgatcagcagcagccagt 


822 


841 


SEQ ID NO: 742 


ggacgctaagaggaagcat 


857 


876 


SEQ ID NO: 743 


agctgttttgaagactctc 


1079 


1098 


SEQ ID NO: 744 


tgaaaaaactaaccatctc 


1105 


1124 


SEQ ID NO: 745 


ctgagctgagaggcctcag 


1168 


1187 


SEQ ID NO: 746 


tgaaacgtgtgcatgccaa 


1303 


1322 


SEQ ID NO: 747 


ccttgtatgcgctgagcca 


1432 


1451 


SEQ ID NO: 748 


aggagctgctggacattgc 


1492 


1511 


SEQ ID NO: 749 


atttgattctgcgggtcat


1567 


1586 


SEQ ID NO: 750 


tccagaactcaagtcttca 


1619 


1638 


SEQ ID NO: 751 


ggttcttcttcagactttc 


1736 


1755 


SEQ ID NO: 752 


gttgatgaggagtccttca 


1802 


1821 


SEQ ID NO: 753 


tccaagatctgaaaaagtt 


1933 


1952 


SEQ ID NO: 754 


agttagtgaaagaagttct 


1948 


1967 


SEQ ID NO: 755 


gaagggaatcttatatttg 


2076 


2095 



SEQ ID NO: 1721 gttattttgctaaacttgg 14044 14063 1 4 

SEQ ID NO: 1722 tcatcctctaattttttaa 13792 13811 1 4 

SEQ ID NO: 1723 tttcatttgaaagaataaa 7024 7043 1 4 

SEQ ID NO: 1724 gaataccaatgctgaactt 10160 10179 1 4 

SEQ ID NO: 1725 ctgagagaagtgtcttcaa 12399 12418 1 4 

SEQ ID NO: 1726 atatctggaaccttgaagt 10729 10748 1 4 

SEQ ID NO: 1727 attccctgaagttgatgtg 11480 11499 1 4 

SEQ ID NO: 1728 atttttattcctgccatgg 10095 10114 1 4 

SEQ ID NO: 1729 aaaattcaaactgcctata 13865 13884 1 4 

SEQ ID NO: 1730 atttgtaagaaaatacaga 6428 6447 1 4 

SEQ ID NO: 1731 cagcatgcctagtttctcc 9944 9963 1 4 

SEQ ID NO: 1732 tctcctttcttcatcttca 10205 10224 1 4 

SEQ ID NO: 1733 ttggtagagcaagggttca 7848 7867 1 4 

SEQ ID NO: 1734 cctcctacagtggtggcaa 4222 4241 1 4 

SEQ ID NO: 1735 tgaaaacgacaaagcaatc 9595 9614 3 3 

SEQ ID NO: 1736 acttttctaaacttgaaat 9055 9074 3 3 

SEQ ID NO: 1737 tcagcccagccatttgaga 9228 9247 2 3 

SEQ ID NO: 1738 gctggatgtaaccaccagc 11177 11196 2 3 

SEQ ID NO: 1739 catcagaaccattgaccag 8126 8145 2 3 

SEQ ID NO: 1740 actcaatggtgaaattcag 7457 7476 2 3

SEQ ID NO: 1741 cctcacttcctttggactg 8969 8988 2 3 

SEQ ID NO: 1742 tgcaaacttgacttcagag 11391 11410 2 3 

SEQ ID NO: 1743 atgacgttcttgagcatgt 7042 7061 2 3 

SEQ ID NO: 1744 actggacttctctagtcag 8801 8820 2 3 

SEQ ID NO: 1745 atgcctacgttccatgtcc 11346 11365 2 3 

SEQ ID NO: 1746 gagaagtgtcttcaaagct 12403 12422 2 3 

SEQ ID NO: 1747 gagatcaacacaatcttca 13104 13123 2 3 

SEQ ID NO: 1748 ctgaattactgcacctcag 3027 3046 2 3 

SEQ ID NO: 1749 ttggtagagcaagggttca 7848 7867 2 3 

SEQ ID NO: 1750 tggcactgtttggagaagg 9130 9149 2 3 

SEQ ID NO: 1751 gcaagtcagcccagttcct 10920 10939 2 3 

SEQ ID NO: 1752 atgaaaccaatgacaaaat 7420 7439 2 3 

SEQ ID NO: 1753 tgaaatacaatgctctgga 5512 5531 2 3 

SEQ ID NO: 1754 gaaataccaagtcaaaacc 10447 10466 2 3 

SEQ ID NO: 1755 tgaaaaagctgcaatcaac 13726 13745 2 3 

SEQ ID NO: 1756 aactgcttctccaaatgga 3544 3563 2 3 

SEQ ID NO: 1757 agaattcataatcccaact 8267 8286 2 3 

SEQ ID NO: 1758 caaaacctactgtctcttc 10459 10478 2 3 
SEQ ID NO: 756


ggaagctctttttgggaag 


2213 


2232 


SEQ ID NO: 1759 


cttcacataccagaattcc 


8316 8335 2 


3 


SEQ ID NO: 757 


tggaataatgctcagtgtt 


2366 


2385 


SEQ ID NO: 1760 


aacaaacacaggcattcca 


9648 9667 2 


3 


SEQ ID NO: 758 


gatttgaaatccaaagaag 


2400 


2419 


SEQ ID NO: 1761 


cttcatgtccctagaaatc 


10029 10048 2 


3 


SEQ ID NO: 759 


tccaaagaagtcccggaag 


2409 


2428 


SEQ ID NO: 1762 


cttcagcctgctttctgga 


4943 4962 2 


3 


SEQ ID NO: 760 


aggaagggctcaaagaatg 


2562 


2581 


SEQ ID NO: 1763 


cattagagctgccagtcct 


10012 10031 2 


3 


SEQ ID NO: 761 


agaatgacttttttcttca 


2575 


2594 


SEQ ID NO: 1764 


tgaagatgacgacttttct 


12152 12171 2 


3 


SEQ ID NO: 762 


tttgtgacaaatatgggca 


2757 


2776 


SEQ ID NO: 1765 


tgccagtttgaaaaacaaa


11807 11826 2 


3 


SEQ ID NO: 763 


ctgaggctaccatgacatt 


3244 


3263 


SEQ ID NO: 1766 


aatgtcagctcttgttcag 


10895 10914 2 


3 


SEQ ID NO: 764 


gtagataccaaaaaaatga 


3660 


3679 


SEQ ID NO: 1767 


tcatttgccctcaacctac 


11442 11461 2 


3 


SEQ ID NO: 765 


aaatgacttccaatttccc 


3673 


3692 


SEQ ID NO: 1768 


gggaactgttgaaagattt 


12919 12938 2 


3 


SEQ ID NO: 766 


atgacttccaatttccctg


3675 


3694 


SEQ ID NO: 1769 


caggagaacttactatcat 


13777 13796 2 


3 


SEQ ID NO: 767 


atctgccatctcgagagtt 


4096 


4115 


SEQ ID NO: 1770 


aactcctccactgaaagat 


9539 9558 2 


3 


SEQ ID NO: 768 


atttgtttgtcaaagaagt 


4543 


4562 


SEQ ID NO: 1771 


acttccgtttaccagaaat 


8239 8258 2 


3 


SEQ ID NO: 769 


gcagagcttggcctctctg 


5127 


5146 


SEQ ID NO: 1772 


cagagctttctgccactgc 


13510 13529 2 


3 


SEQ ID NO: 770 


atatgctgaaatgaaattt 


5345 


5364 


SEQ ID NO: 1773 


aaattcaaactgcctatat 


13866 13885 2 


3 


SEQ ID NO: 771 


tcaaaacttgacaacattt 


5412 


5431 


SEQ ID NO: 1774 


aaatacttccacaaattga 


8772 8791 2 


3 


SEQ ID NO: 772 


cagtgacctgaaatacaat 


5504 


5523 


SEQ ID NO: 1775 


attgaacatccccaaactg 


8786 8805 2 


3 


SEQ ID NO: 773 


tacaaatggcaatgggaaa 


5840 


5859 


SEQ ID NO: 1776 


tttcaactgcctttgtgta 


11221 11240 2 


3 


SEQ ID NO: 774 


cttttgtaaagtatgataa 


6277 


6296 


SEQ ID NO: 1777 


ttattgctgaatccaaaag 


13648 13667 2 


3 


SEQ ID NO: 775 


ttgtaaagtatgataaaaa 


6280 


6299 


SEQ ID NO: 1778 


ttttcaagcaaatgcacaa 


8531 8550 2 


3 


SEQ ID NO: 776 


tccattaacctcccatttt 


6312 


6331 


SEQ ID NO: 1779 


aaaagaaaattttgctgga 


10748 10767 2 


3 


SEQ ID NO: 777 


gattatctgaattcattca 


6480 


6499 


SEQ ID NO: 1780 


tgaagtagaccaacaaatc 


7154 7173 2 


3 


SEQ ID NO: 778 


aattgggagagacaagttt 


6498 


6517 


SEQ ID NO: 1781 


aaactaaatgatctaaatt 


11316 11335 2 


3 


SEQ ID NO: 779 


atttgaaaatagctattgc 


6688 


6707 


SEQ ID NO: 1782 


gcaatttctgcacagaaat 


13433 13452 2 


3 


SEQ ID NO: 780 


tgagcatgtcaaacacttt 


7052 


7071 


SEQ ID NO: 1783 


aaagccattcagtctctca 


12963 12982 2 


3 


SEQ ID NO: 781 


ttgaagatgttaacaaatt 


7348 


7367 


SEQ ID NO: 1784 


aattccatatgaaagtcaa 


12652 12671 2 


3 


SEQ ID NO: 782 


acttgtcacctacatttct 


7745 


7764 


SEQ ID NO: 1785


agaatattttgatccaagt 


13268 13287 2 


3 


SEQ ID NO: 783 


gttttccacaccagaattt 


8042 


8061 


SEQ ID NO: 1786 


aaatctggatttcttaaac 


9473 9492 2 


3 


SEQ ID NO: 784 


ataagtacaaccaaaattt 


9397 


9416 


SEQ ID NO: 1787 


aaataaatggagtctttat 


14075 14094 2 


3 


SEQ ID NO: 785 


cgggacctgcggggctgag 


0 


19 


SEQ ID NO: 1788 


ctcagttaactgtgtcccg 


11563 11582 1 


3 


SEQ ID NO: 786 


agtgcccttctcggttgct 


17 


36 


SEQ ID NO: 1789 


agcatctgattgactcact 


12670 12689 1 


3 


SEQ ID NO: 787 


gctgaggagcccgcccagc 


39 


58 


SEQ ID NO: 1790 


gctgattgaggtgtccagc 


1217 1236 1 


3 


SEQ ID NO: 788 


gaggagcccgcccagccag 


42 


61 


SEQ ID NO: 1791 


ctggatcacagagtccctc 


3744 3763 1 


3 


SEQ ID NO: 789 


gggccgcgaggccgaggcc 


64 


83 


SEQ ID NO: 1792 


ggccctgatccccgagccc 


1355 1374 1 


3 


SEQ ID NO: 790 


ccaggccgcagcccaggag


81


100 


SEQ ID NO: 1793 


ctcccggagccaaggctgg 


2674 2693 1 


3 


SEQ ID NO: 791 


ggagccgccccaccgcagc 


96 


115 


SEQ ID NO: 1794 


gctgttttgaagactctcc 


1080 1099 1 


3 


SEQ ID NO: 792 


gaagaggaaatgctggaaa 


192 


211 


SEQ ID NO: 1795 


tttcaagttcctgaccttc 


8301 8320 1 


3 


SEQ ID NO: 793 


caaaagatgcgacccgatt 


229 


248 


SEQ ID NO: 1796 


aatcttattggggattttg 


7077 7096 1 


3 
SEQ ID NO: 794


attcaagcacctccggaag 


245 




SEQ ID NO: 1797 


c ccaca caaggaa 




10078 1 


3 


SEQ ID NO: 795 


gttccagtggagtccctgg 


289 


308 


SEQ ID NO: 1798




ccagcaag acc gagaac 


8602 




3 


SEQ ID NO: 796 


gactgctgattcaagaagt 


308 


327 


SEQ ID NO: 1799




13316 


13335 1 


3 


SEQ ID NO: 797 


gtgccaccaggatcaactg 


325 


344 


SEQ ID NO: 1800


ca aa g 9a ^ aa9a ^ 
rag aagc cagggcac 


10696 


10715 1 


3 


SEQ ID NO: 798 


gatcaactgcaaggttgag 


335 


354 


SEQ ID NO: 1801


ccaccccacccgac 


4740 


4759 1 


3 


SEQ ID NO: 799 


actgcaaggttgagctgga 


340 


359 


SEQ ID NO: 1802




1281 


1300 1 


3 


SEQ ID NO: 800 


ccagctctgcagcttcatc 






SEQ ID NO: 1803


gTtgtggteLctecdgg 


1335 


1354 1 




SEQ ID NO: 801 


agcttcatcctgaagacca 


375 


394 


SEQ ID NO: 1804


tggtgctggagaatgagct


5104 


5123 1 


3 


SEQ ID NO: 802 


cttcatcctgaagaccagc 


377 


396 


SEQ ID NO: 1805




gctggagtaaaactggaag 


2688 


2707 1 


3 


SEQ ID NO: 803 








SEQ ID NO: 1806


ttcaagatgactgcactgg 








SEQ ID NO: 804 




396 


415 


SEQ ID NO: 1807




5222


5241 1 


3


SEQ ID NO: 805 


tggcttcaaccctgagggc


419 


438 


SEQ ID NO: 1808




3525 


3544 1 




SEQ ID NO: 806 


cttcaaccctgagggcaaa


422 


441 


SEQ ID NO: 1809


tttgagccaacattggaag 


2199 


2218 1 


3 


SEQ ID NO: 807 


ttcaaccctgagggcaaag 


423 


442 


SEQ ID NO: 1810




ctttgacaggcattttgaa 


9719 


9738 1 


3 


SEQ ID NO: 808 








SEQ ID NO: 1811


cttgaaattcaatcacaag 








SEQ ID NO: 809 


tgctgaagaaaaccaagaa 


445 


464 


SEQ ID NO: 1812




5639 


5658 1 




SEQ ID NO: 810 


ttgctgcagccatgtccag 


475 


494 


SEQ ID NO: 1813




2996 


3015 1 




SEQ ID NO: 811 


tgctgcagccatgtccagg 


476 


495 


SEQ ID NO: 1814


0 T 7° 9caa9caa 

ccggcag gcaagca 


2995 


3014 1 


3 


SEQ ID NO: 812 


agccatgtccaggtatgag 


482 


501 


SEQ ID NO: 1815






1285 


1304 1 


3 


SEQ ID NO: 813 


agctcaagctggccattcc


499 




SEQ ID NO: 1816








3 


SEQ ID NO: 814 


agaagggaagcaggttttc 


518 


537 


SEQ ID NO: 1817




9 !IaLteaaffiat a | a t a9C 


13813 


13832 1 


3 


SEQ ID NO: 815 


aagggaagcaggttttcct 


520 


539 


SEQ ID NO: 1818


aggacaccaaaa aacc 








SEQ ID NO: 816 








SEQ ID NO: 1819




4844 


4863 1 


3 


SEQ ID NO: 817 


atcctgaacatcaagaggg 


567 


586 


SEQ ID NO: 1820


ccctaacagatttgaggat


7969 


7988 1 




SEQ ID NO: 818 








SEQ ID NO: 1821




7968 


7987 1 




SEQ ID NO: 819 


ctgaacatcaagaggggca


570 


589 


SEQ ID NO: 1822


trcTtg^cmJaagteag 3 


7900 


7919 1 




SEQ ID NO: 820 


aacatcaagaggggcatca 


573 


592 


SEQ ID NO: 1823


tgataaaaaccaagatgtt 


6290 


6309 1 


3 


SEQ ID NO: 821 


acatcaagaggggcatcat 


574 


593 


SEQ ID NO: 1824


atgataaaaaccaagatgt 


6289 


6308 1 


3 


SEQ ID NO: 822 


tcatttctgccctcctggt 


589 


608 


SEQ ID NO: 1825


accaccagtttgtagatga 


7405 


7424 1 


3 


SEQ ID NO: 823 


ttcccccagagacagaaga 


607 


626 


SEQ ID NO: 1826


tcttccacatttcaaggaa 


10058 


10077 1 




SEQ ID NO: 824 


gaagaagccaagcaagtgt 


621 


640 


SEQ ID NO: 1827


acaccttccacattccttc 


8071 


8090 1 


3 


SEQ ID NO: 825 


ttgtttctggataccgtgt 


639 


658 


SEQ ID NO: 1828


acactaaatacttccacaa 


8767 


8786 1 


3 


SEQ ID NO: 826 


tgtatggaaactgctccac 


655 


674 


SEQ ID NO: 1829


gtggaggcaacacattaca 


2920 


2939 1 


3 


SEQ ID NO: 827 


aaactgctccactcacttt 


662 


681 


SEQ ID NO: 1830


aaagaaacagcatttgttt 


4532 


4551 1 


3 


SEQ ID NO: 828 


actcactttaccgtcaaga 


672 


691 


SEQ ID NO: 1831


tcttacttttccattgagt 


10572 


10591 1 


3 


SEQ ID NO: 829 


ctttaccgtcaagacgagg 


677 


696 


SEQ ID NO: 1832


cctccagctcctgggaaag 


2483 


2502 1 


3 


SEQ ID NO: 830 


ttaccgtcaagacgaggaa 


679 


698 


SEQ ID NO: 1833


ttcctaaagctggatgtaa 


11169 


11188 1 


3 


SEQ ID NO: 831 


acgaggaagggcaatgtgg 


690 


709 


SEQ ID NO: 1834


ccacaagtcatcatctcgt 


5956 


5975 1 


3 
SEQ ID NO: 832


cgaggaagggcaatgtggc 


691 


710 


SEQ ID NO: 1835 


gccagaagtgagatcctcg 


3507 


3526 1 


3 


SEQ ID NO: 833 


gaggaagggcaatgtggca 


692 


711 


SEQ ID NO: 1836 


tgccagtctccatgacctc 


2468 


2487 1 


3 


SEQ ID NO: 834 


ggaagggcaatgtggcaac 


694 


713 


SEQ ID NO: 1837 


gttgctcttaaggacttcc 


13356 13375 1 


3 


SEQ ID NO: 835 


gaagggcaatgtggcaaca 


695 


714 


SEQ ID NO: 1838 


tgttgatgaggagtccttc 


1801 


1820 1 


3 


SEQ ID NO: 836 


caggcatcagcccacttgc 


769 


788 


SEQ ID NO: 1839 


gcaagtctttcctggcctg 


3011 


3030 1 


3 


SEQ ID NO: 837 


aggcatcagcccacttgct 


770 


789 


SEQ ID NO: 1840 


agcaagtctttcctggcct 


3010 


3029 1 


3 


SEQ ID NO: 838 


tcagcccacttgctctcat 


775 


794 


SEQ ID NO: 1841 


atgaaagtcaagcatctga 


12660 


12679 1 


3 


SEQ ID NO: 839


gtcaactctgatcagcagc


815


834


SEQ ID NO: 1842


gctgactttaaaatctgac


4811


4830 1


3


SEQ ID NO: 840


ggacgctaagaggaagcat


857


876


SEQ ID NO: 1843


atgcactgtttctgagtcc


9331


9350 1


3


SEQ ID NO: 841 


aaggagcaacacctcttcc 


894 


913 


SEQ ID NO: 1844 


ggaatatcttagcatcctt 


13457 


13476 1 


3 


SEQ ID NO: 842 


aggagcaacacctcttcct 


895 


914 


SEQ ID NO: 1845 


aggaatatcttagcatcct 


13456 


13475 1 


3 


SEQ ID NO: 843 


caacacctcttcctgcctt 


900 


919 


SEQ ID NO: 1846 


aaggctgactctgtggttg 


4284 


4303 1 


3 


SEQ ID NO: 844 


aacacctcttcctgccttt 


901 


920 


SEQ ID NO: 1847 


aaagcaggccgaagctgtt 


1067 


1086 1 


3 


SEQ ID NO: 845 


acaagaataagtatgggat 


925 


944 


SEQ ID NO: 1848 


atccatgatctacatttgt 


6786 


6805 1 


3 


SEQ ID NO: 846 


caagaataagtatgggatg 


926 


945 


SEQ ID NO: 1849 


catcactttacaagccttg 


1238 


1257 1 


3 


SEQ ID NO: 847 


tagcacaagtgacacagac 


946 


965 


SEQ ID NO: 1850 


gtctcttcgttctatgcta 


4584 


4603 1 


3 


SEQ ID NO: 848 


agcacaagtgacacagact 


947 


966 


SEQ ID NO: 1851 


agtctcttcgttctatgct 


4583 


4602 1 


3 


SEQ ID NO: 849 


gcacaagtgacacagactt 


948 


967 


SEQ ID NO: 1852 


aagtgtagtctcctggtgc 


5091 


5110 1 


3 


SEQ ID NO: 850 


aacttgaagacacaccaaa 


970 


989 


SEQ ID NO: 1853 


tttgaggattccatcagtt 


7979 


7998 1 


3 


SEQ ID NO: 851 


gcttctttggtgaaggtac 


1000 


1019 


SEQ ID NO: 1854 


gtacctacttttggcaagc 


8364 


8383 1 


3 


SEQ ID NO: 852 


ctttggtgaaggtactaag 


1004 


1023 


SEQ ID NO: 1855 


cttatgggatttcctaaag 


11159 11178 1 


3 


SEQ ID NO: 853


tactaagaagatgggcctc


1016


1035


SEQ ID NO: 1856


gagggtagtcataacagta


10329


10348 1


3


SEQ ID NO: 854


tttgagagcaccaaatcca


1038


1057


SEQ ID NO: 1857


tggaagtgtcagtggcaaa


10372


10391 1


3


SEQ ID NO: 855


agagcaccaaatccacatc


1042


1061


SEQ ID NO: 1858


gatggatatgaccttctct


4868


4887 1


3


SEQ ID NO: 856 


agctgttttgaagactctc 


1079 


1098 


SEQ ID NO: 1859 


gagaacatactgggcagct 


5872 


5891 1 


3 


SEQ ID NO: 857 


tgaaaaaactaaccatctc 


1105 


1124 


SEQ ID NO: 1860 


gagaaaatcaatgccttca 


7104 


7123 1 


3 


SEQ ID NO: 858 


gaaaaaactaaccatctct 


1106 


1125 


SEQ ID NO: 1861 


agagccaggtcgagctttc 


11044 


11063 1 


3 


SEQ ID NO: 859 


tctgagcaaaatatccaga 


1122 


1141 


SEQ ID NO: 1862 


tctgatgaggaaactcaga 


12252 12271 1 


3 


SEQ ID NO: 860 


tctcttcaataagctggtt 


1148 


1167 


SEQ ID NO: 1863 


aacctcccattttttgaga 


6318 


6337 1 


3 


SEQ ID NO: 861 


ctgagctgagaggcctcag 


1168 


1187 


SEQ ID NO: 1864 


ctgatccccgagccctcag 


1359 


1378 1 


3 


SEQ ID NO: 862 


tgaagcagtcacatctctc 


1190 


1209 


SEQ ID NO: 1865 


gagaaaatcaatgccttca 


7104 


7123 1 


3 


SEQ ID NO: 863 


aagcagtcacatctctctt 


1192 


1211 


SEQ ID NO: 1866 


aagaggcagcttctggctt 


12289 12308 1 


3 


SEQ ID NO: 864 


ctctcttgccacagctgat 


1204 


1223 


SEQ ID NO: 1867 


atcaaaagaagcccaagag 


12938 


12957 1 


3 


SEQ ID NO: 865 


tcttgccacagctgattga 


1207 


1226 


SEQ ID NO: 1868 


tcaaagttaattgggaaga 


12271 


12290 1 


3 


SEQ ID NO: 866 


cttgccacagctgattgag 


1208 


1227 


SEQ ID NO: 1869 


ctcaattttgattttcaag 


8520 


8539 1 


3 


SEQ ID NO: 867 


tgaggtgtccagccccatc 


1223 


1242 


SEQ ID NO: 1870 


gatggaaccctctccctca 


4725 


4744 1 


3 


SEQ ID NO: 868 


tcagtgtggacagcctcag 


1259 


1278 


SEQ ID NO: 1871 


ctgacatcttaggcactga 


4993 


5012 1 


3 


SEQ ID NO: 869 


acatcctccagtggctgaa 


1288 


1307 


SEQ ID NO: 1872 


ttcagaagctaagcaatgt 


7231 


7250 1 


3 
SEQ ID NO: 870


gcacagcagctgcgagaga


1377 


1396 


SEQ ID NO: 1873 


tctctgaaagacaacgtgc 


12315 12334 1 


3 


SEQ ID NO: 871 


cagcagctgcgagagatct 


1380 


1399 


SEQ ID NO: 1874 


agataacattaaacagctg 


13043 13062 1 


3 


SEQ ID NO: 872 


gcgagggatcagcgcagcc 


1407 


1426 


SEQ ID NO: 1875 


ggctcaacacagacatcgc 


5710 


5729 1 


3 


SEQ ID NO: 873 


aagacaaaccctacagggc 


1470 


1489 


SEQ ID NO: 1876 


tcccagaaaacctcttctt 


3928 


3947 1 


3 


SEQ ID NO: 874 


caggagctgctggacattg 


1491 


1510 


SEQ ID NO: 1877 


caatggagagtccaacctg 


4652 


4671 1 


3 


SEQ ID NO: 875 


aggagctgctggacattgc 


1492 


1511 


SEQ ID NO: 1878 


gcaagggttcactgttcct 


7856 


7875 1 


3 


SEQ ID NO: 876 


ctgctggacattgctaatt 


1497 


1516 


SEQ ID NO: 1879 


aattgggaagaagaggcac 


12279 


12298 1 


3 


SEQ ID NO: 877 


gattacacctatttgattc 


1557 


1576 


SEQ ID NO: 1880 


gaatattttgagaggaatc


6345 


6364 1 


3 


SEQ ID NO: 878 


atttgattctgcgggtcat 


1567 


1586 


SEQ ID NO: 1881 


atgaagtagaccaacaaat 


7153 


7172 1 


3 


SEQ ID NO: 879 


tctgcgggtcattggaaat 


1574 


1593 


SEQ ID NO: 1882 


atttgtaagaaaatacaga 


6428 


6447 1 


3 


SEQ ID NO: 880 


aaccatggagcagttaact 


1601 


1620 


SEQ ID NO: 1883 


agtttctccatcctaggtt 


9954 


9973 1 


3 


SEQ ID NO: 881 


ggagcagttaactccagaa 


1607 


1626 


SEQ ID NO: 1884 


ttctgaaaatccaatctcc 


8392 


8411 1 


3 


SEQ ID NO: 882 


actccagaactcaagtctt 


1617 


1636 


SEQ ID NO: 1885 


aagatcgcagactttgagt 


11646 


11665 1 


3 


SEQ ID NO: 883 


tccagaactcaagtcttca 


1619 


1638 


SEQ ID NO: 1886 


tgaactcagaagaattgga 


1912 


1931 1 


3 


SEQ ID NO: 884 


aagtacaaagccatcactg 


1655 


1674 


SEQ ID NO: 1887 


cagtcatgtagaaaaactt 


4421 


4440 1 


3 


SEQ ID NO: 885 


gccatcactgatgatccag 


1664 


1683 


SEQ ID NO: 1888 


ctggaactctctccatggc 


10875 


10894 1 


3 


SEQ ID NO: 886 


ccatcactgatgatccaga 


1665 


1684 


SEQ ID NO: 1889 


tctgaactcagaaggatgg 


13991 


14010 1 


3 


SEQ ID NO: 887 


atccagaaagctgccatcc 


1677 


1696 


SEQ ID NO: 1890 


ggatttcctaaagctggat 


11165 11184 1 


3 


SEQ ID NO: 888 


cagaaagctgccatccagg


1680 


1699 


SEQ ID NO: 1891 


cctgaaatacaatgctctg 


5510 


5529 1 


3 


SEQ ID NO: 889 


acaaggaccaggaggttct 


1723 


1742 


SEQ ID NO: 1892 


agaaacagcatttgtttgt 


4534 


4553 1 


3 


SEQ ID NO: 890 


aggaccaggaggttcttct 


1726 


1745 


SEQ ID NO: 1893 


agaagctaagcaatgtcct 


7234 


7253 1 


3 


SEQ ID NO: 891 


accaggaggttcttcttca 


1729 


1748 


SEQ ID NO: 1894 


tgaaggctgactctgtggt 


4282 


4301 1 


3 


SEQ ID NO: 892 


tcttcagactttccttgat 


1742 


1761 


SEQ ID NO: 1895 


atcaggaagggctcaaaga 


2559 


2578 1 


3 


SEQ ID NO: 893 


ttcagactttccttgatga 


1744 


1763 


SEQ ID NO: 1896 


tcattactcctgggctgaa 


11299 11318 1 


3 


SEQ ID NO: 894 


gttgatgaggagtccttca 


1802 


1821 


SEQ ID NO: 1897 


tgaatctggctccctcaac 


9038 


9057 1 


3 


SEQ ID NO: 895 


cttcacaggcagatattaa 


1816 


1835 


SEQ ID NO: 1898 


ttaatcgagaggtatgaag 


7140 


7159 1 


3 


SEQ ID NO: 896 


ttcacaggcagatattaac 


1817 


1836 


SEQ ID NO: 1899 


gttaatcgagaggtatgaa 


7139 


7158 1 


3 


SEQ ID NO: 897 


ggcagatattaacaaaatt


1823 


1842 


SEQ ID NO: 1900 


aattgcattagatgatgcc 


6581 


6600 1 


3 


SEQ ID NO: 898 


atattaacaaaattgtcca 


1828 


1847 


SEQ ID NO: 1901 


tggagtttgtgacaaatat 


2752 


2771 1 


3 


SEQ ID NO: 899 


acaaaattgtccaaattct 


1834 


1853 


SEQ ID NO: 1902 


agaaacagcatttgtttgt 


4534 


4553 1 


3 


SEQ ID NO: 900 


gagcaagtgaagaactttg 


1869 


1888 


SEQ ID NO: 1903 


caaatgacatgatgggctc 


5326 


5345 1 


3 


SEQ ID NO: 901 


gtgaagaactttgtggctt 


1875 


1894 


SEQ ID NO: 1904 


aagcatctgattgactcac 


12669 


12688 1 


3 


SEQ ID NO: 902 


agaactttgtggcttccca 


1879 


1898 


SEQ ID NO: 1905 


tgggcctgccccagattct 


8901 


8920 1 


3 


SEQ ID NO: 903 


tttgtggcttcccatattg 


1884 


1903 


SEQ ID NO: 1906 


caataagatcaatagcaaa 


8990 


9009 1 


3 


SEQ ID NO: 904 


tggcttcccatattgccaa 


1888 


1907 


SEQ ID NO: 1907 


ttggctcacatgaaggcca 


7623 


7642 1 


3 


SEQ ID NO: 905 


ttcccatattgccaatatc 


1892 


1911 


SEQ ID NO: 1908 


gatatacactagggaggaa 


12737 


12756 1 


3 


SEQ ID NO: 906 


tcccatattgccaatatct 


1893 


1912 


SEQ ID NO: 1909 


agatcaaagttaattggga 


12268 


12287 1 


3 


SEQ ID NO: 907 


ttgccaatatcttgaactc 


1900 


1919 


SEQ ID NO: 1910 


gagtcccagtgcccagcaa 


9344 


9363 1 


3 
SEQ ID NO: 908


ttggatatccaagatctga 


1926 


1945 


SEQ ID NO: 1911 


tcagtataagtacaaccaa 


9392 


9411 1 


3 


SEQ ID NO: 909 


tccaagatctgaaaaagtt 


1933 


1952 


SEQ ID NO: 1912 


aacttccaactgtcatgga 


1978 


1997 1 


3 


SEQ ID NO: 910 




1941 


1960 


SEQ ID NO: 1913 


ctttgaagtcagtcttcag 


7907 


7926 1 


3 


SEQ ID NO: 911 


agttagtgaaagaagttct 


1948 


1967 


SEQ ID NO: 1914 


agaatctcaacttccaact 


1970 


1989 1 


3 


SEQ ID NO: 912 


aatctcaacttccaactgt 


1972 


1991 


SEQ ID NO: 1915 


acaggggtcctttatgatt 


12342 


12361 1 


3 


SEQ ID NO: 913 


gtcatggacttcagaaaat 


1989 


2008 


SEQ ID NO: 1916 


atttgaaagaataaatgac 


7028 


7047 1 


3 


SEQ ID NO: 914 


tcaactctacaaatctgtt


2021 


2040 


SEQ ID NO: 1917 


aacacattgaggctattga 


6970 


6989 1 


3 


SEQ ID NO: 915 


aactctacaaatctgtttc 


2023 


2042 


SEQ ID NO: 1918 


gaaaaaggggattgaagtt 


10276 10295 1 


3 


SEQ ID NO: 916 


aaatagaagggaatcttat 


2071 


2090 


SEQ ID NO: 1919 


ataagcaaactgttaattt 


5449 


5468 1 


3 


SEQ ID NO: 917 


agaagggaatcttatattt 


2075 


2094 


SEQ ID NO: 1920 


aaatgcactgctgcgttct 


4892 


4911 1 


3 


SEQ ID NO: 918 


gaagggaatcttatatttg 


2076 


2095 


SEQ ID NO: 1921 


caaaaacattttcaacttc 


5279 


5298 1 


3 


SEQ ID NO: 919 


tgatccaaataactacctt 


2093 


2112 


SEQ ID NO: 1922 


aaggaagaaagaaaaatca 3453 


3472 1 


3 


SEQ ID NO: 920 


tggatttgcttcagctgac 


2150 


2169 


SEQ ID NO: 1923 


gtcagcccagttccttcca 


10924 


10943 1 


3 


SEQ ID NO: 921 


tttgcttcagctgacctca 


2154 


2173 


SEQ ID NO: 1924 


tgaggaaactcagatcaaa 


12257 12276 1 


3 


SEQ ID NO: 922 


cttggaaggaaaaggcttt


2183 


2202 


SEQ ID NO: 1925 


aaagcattggtagagcaag 


7842 


7861 1 


3 


SEQ ID NO: 923 


tggaaggaaaaggctttga 


2185 


2204 


SEQ ID NO: 1926 


tcaagtctgtgggattcca 


4078 


4097 1 


3 


SEQ ID NO: 924 


ggctttgagccaacattgg 


2196 


2215 


SEQ ID NO: 1927 


ccaagaggtatttaaagcc 


12950 


12969 1 


3 


SEQ ID NO: 925 


tgagccaacattggaagct 


2201 


2220 


SEQ ID NO: 1928 


agctttctgccactgctca 


13513 13532 1 


3 


SEQ ID NO: 926 


gagccaacattggaagctc 


2202 


2221 


SEQ ID NO: 1929 


gagctttctgccactgctc 


13512 13531 1 


3 


SEQ ID NO: 927 


aacattggaagctcttttt 


2207 


2226 


SEQ ID NO: 1930 


aaaagaaacagcatttgtt 


4531 


4550 1 


3 


SEQ ID NO: 928 


tggaagctctttttgggaa 


2212 


2231 


SEQ ID NO: 1931 


ttccggcacgtgggttcca 


3777 


3796 1 


3 


SEQ ID NO: 929 


ctctttttgggaagcaagg 


2218 


2237 


SEQ ID NO: 1932 


ccttactgactttgcagag 


7790 


7809 1 


3 


SEQ ID NO: 930 


tttttgggaagcaaggatt 


2221 


2240 


SEQ ID NO: 1933 


aatcattgaaaaattaaaa 


6722 


6741 1 


3 


SEQ ID NO: 931 


ttttcccagacagtgtcaa 


2239 


2258 


SEQ ID NO: 1934 


ttgatgaaatcattgaaaa 


6715 


6734 1 


3 


SEQ ID NO: 932 


ttggctataccaaagatga 


2323 


2342 


SEQ ID NO: 1935 


tcattgctcccggagccaa 


2668 


2687 1 


3 


SEQ ID NO: 933 


ataccaaagatgataaaca 


2329 


2348 


SEQ ID NO: 1936 


tgttgcttttgtaaagtat 


6272 


6291 1 


3 


SEQ ID NO: 934 


gagcaggatatggtaaatg 


2349 


2368 


SEQ ID NO: 1937 


catttcagccttcgggctc 


4254 


4273 1 


3 


SEQ ID NO: 935 


atggtaaatggaataatgc 


2358 


2377 


SEQ ID NO: 1938 


gcatgcctagtttctccat 


9946 


9965 1 


3 


SEQ ID NO: 936 


tggtaaatggaataatgct 


2359 


2378 


SEQ ID NO: 1939 


agcacagtacgaaaaacc 


5 10801 


10820 1 


3 


SEQ ID NO: 937 


taaatggaataatgctcag 


2362 


2381 


SEQ ID NO: 1940 


ctgaaagagatgaaattta 


13059 13078 1 


3 


SEQ ID NO: 938 


tggaataatgctcagtgtt 


2366 


2385 


SEQ ID NO: 1941 


aacagatttgaggattcca 


7973 


7992 1 


3 


SEQ ID NO: 939 


tcagtgttgagaagctgat 


2377 


2396 


SEQ ID NO: 1942 


atcacaactcctccactga 


9534 


9553 1 


3 


SEQ ID NO: 940 




2378 


2397 


SEQ ID NO: 1943 


aatcacaactcctccactg 


9533 


9552 1 


3 


SEQ ID NO: 941 


agtgttgagaagctgatta 


2379 


2398 


SEQ ID NO: 1944 


taatcacaactcctccact 


9532 


9551 1 


3 


SEQ ID NO: 942 


gattaaagatttgaaatcc 


2393 


2412 


SEQ ID NO: 1945 


ggatactaagtaccaaatc 


6866 


6885 1 


3 


SEQ ID NO: 943 


gatttgaaatccaaagaag 


2400 


2419 


SEQ ID NO: 1946 


cttccgtttaccagaaatc 


8240 


8259 1 


3 


SEQ ID NO: 944 


atttgaaatccaaagaagt 


2401 


2420 


SEQ ID NO: 1947 


acttccgtttaccagaaat 


8239 


8258 1 


3 


SEQ ID NO: 945 


atccaaagaagtcccggaa 


2408 


2427 


SEQ ID NO: 1948 


ttccaatttccctgtggat 


3680 


3699 1 


3 
SEQ ID NO: 946


tccaaagaagtcccggaag 


2409 


2428 


SEQ ID NO: 1949 


cttccaatttccctgtgga 


3679 3698 1 


3 


SEQ ID NO: 947 


agagcctacctccgcatct 


2430 


2449 


SEQ ID NO: 1950 


agattaatccgctggctct 


8563 8582 1 


3 


SEQ ID NO: 948 


gagcctacctccgcatctt 


2431 


2450 


SEQ ID NO: 1951 


aagattaatccgctggctc 


8562 8581 1 


3 


SEQ ID NO: 949


cttgggagaggagcttggt 


2447 


2466 


SEQ ID NO: 1952 


accactgggacctaccaag 


12519 12538 1 


3 


SEQ ID NO: 950


ggagcttggttttgccagt 


2456 


2475 


SEQ ID NO: 1953 


actggtggcaaaaccctcc 


2726 2745 1 


3 


SEQ ID NO: 951


ttggttttgccagtctcca 


2461 


2480 


SEQ ID NO: 1954 


tggagaagccacactccaa 


10763 10782 1 


3 


SEQ ID NO: 952


cagtctccatgacctccag 


2471 


2490 


SEQ ID NO: 1955 


ctggtcgcctgccaaactg 


3530 3549 1 


3 


SEQ ID NO: 953


ctccatgacctccagctcc 


2475 


2494 


SEQ ID NO: 1956 


ggagtcattgctcccggag 


2664 2683 1 


3 


SEQ ID NO: 954


ctgggaaagctgcttctga 


2493 


2512 


SEQ ID NO: 1957 


tcagaaagctaccttccag 


7931 7950 1 


3 


SEQ ID NO: 955


gaggtcatcaggaagggct 


2553 


2572 


SEQ ID NO: 1958 


agccagaagtgagatcctc 


3506 3525 1 


3 


SEQ ID NO: 956


aagaatgacttttttcttc 


2574 


2593 


SEQ ID NO: 1959 


gaaggcatctgggagtctt 


3827 3846 1 


3 


SEQ ID NO: 957


cttttttcttcactacatc 


2582 


2601 


SEQ ID NO: 1960 


gatgcttacaacactaaag 


6099 6118 1 


3 


SEQ ID NO: 958


catcttcatggagaatgcc 


2597 


2616 


SEQ ID NO: 1961 


ggcacttccaaaattgatg 


10710 10729 1 


3 


SEQ ID NO: 959


cttcatggagaatgccttt 


2600 


2619 


SEQ ID NO: 1962 


aaagttaattgggaagaag 


12273 12292 1 


3 


SEQ ID NO: 960


aatgcctttgaactcccca 


2610 


2629 


SEQ ID NO: 1963 


tgggctggcttcagccatt 


5729 5748 1 


3 


SEQ ID NO: 961


gcctttgaactccccactg 


2613 


2632 


SEQ ID NO: 1964 


cagtctgaacattgcaggc 


5375 5394 1 


3 


SEQ ID NO: 962


caaggctggagtaaaactg 


2684 


2703 


SEQ ID NO: 1965 


cagtgcaacgaccaacttg 


5072 5091 1 


3 


SEQ ID NO: 963


tggagtaaaactggaagta 


2690 


2709 


SEQ ID NO: 1966 


tactccaacgccagctcca 


3051 3070 1 


3 


SEQ ID NO: 964


ggaagtagccaacatgcag 


2702 


2721 


SEQ ID NO: 1967 


ctgccatctcgagagttcc 


4098 4117 1 


3 


SEQ ID NO: 965


tttgtgacaaatatgggca 


2757 


2776 


SEQ ID NO: 1968 


tgcctttgtgtacaccaaa 


11228 11247 1 


3 


SEQ ID NO: 966


tgtgacaaatatgggcatc 


2759 


2778 


SEQ ID NO: 1969 


gatgggtctctacgccaca 


4377 4396 1 


3 


SEQ ID NO: 967


ggacttcgctaggagtggg 


2786 


2805 


SEQ ID NO: 1970 


cccaaggccacaggggtcc 


12333 12352 1 


3 


SEQ ID NO: 968


gtggggtccagatgaacac 


2800 


2819 


SEQ ID NO: 1971


gtgttctagacctctccac 


4171 4190 1 


3 


SEQ ID NO: 969


ttccacgagtcgggtctgg 


2826 


2845 


SEQ ID NO: 1972 


ccagaatctgtaccaggaa 


12554 12573 1 


3 


SEQ ID NO: 970


agtcgggtctggaggctca 


2833 


2852 


SEQ ID NO: 1973 


tgagaactacgagctgact 


4799 4818 1 


3 


SEQ ID NO: 971


tcgggtctggaggctcatg 


2835 


2854 


SEQ ID NO: 1974 


catgaaggccaaattccga 


7631 7650 1 


3 


SEQ ID NO: 972


aaaagctgggaagctgaag 


2861 


2880 


SEQ ID NO: 1975 


cttccagacacctgatttt 


7943 7962 1 


3 


SEQ ID NO: 973


aagctgaagtttatcattc 


2871 


2890 


SEQ ID NO: 1976 


gaatttacaattgttgctt 


6261 6280 1


3 


SEQ ID NO: 974


gagaccagtcaagctgctc


2900 


2919 


SEQ ID NO: 1977 


gagcttcaggaagcttctc 


13206 13225 1 


3 


SEQ ID NO: 975


gcaacacattacatttggt 


2926 


2945 


SEQ ID NO: 1978 


accagtcagatattgttgc 


10183 10202 1 


3 


SEQ ID NO: 976


acattacatttggtctcta 


2931 


2950 


SEQ ID NO: 1979 


tagaatatgaactaaatgt 


11881 11900 1 


3 


SEQ ID NO: 977


cattacatttggtctctac 


2932 


2951 


SEQ ID NO: 1980 


gtagctgagaaaatcaatg 


7098 7117 1 


3 


SEQ ID NO: 978


aaacggaggtgatcccacc 


2956 


2975 


SEQ ID NO: 1981 


ggtggataccctgaagttt 


3197 3216 1 


3 


SEQ ID NO: 979


attgagaacaggcagtcct 


2979 


2998 


SEQ ID NO: 1982 


aggaaaagcgcacctcaat 


12023 12042 1 


3 


SEQ ID NO: 980


tgagaacaggcagtcctgg 


2981 


3000 


SEQ ID NO: 1983 


ccagcttccccacatctca 


8333 8352 1 


3 


SEQ ID NO: 981


ctgcacctcaggcgcttac 


3035 


3054 


SEQ ID NO: 1984 


gtaagaaaatacagagcag 


6432 6451 1 


3 


SEQ ID NO: 982


tccacagactccgcctcct 


3066 


3085 


SEQ ID NO: 1985 


aggacagagccttggtgga 


3184 3203 1 


3 


SEQ ID NO: 983


ctgaccggggacaccagat 


3093 


3112 


SEQ ID NO: 1986


atctgatgaggaaactcag 


12251 12270 1 


3 
SEQ ID NO: 984


tagagctggaactgaggcc 


3112 


3131 


SEQ ID NO: 1987


ggcctctctggggcatcta 


5136 5155 1 


3 


SEQ ID NO: 985 


ctatgagctccagagagag 


3167 


3186 


SEQ ID NO: 1988


ctctcacaaaaaagtatag 


6541 6560 1 


3 


SEQ ID NO: 986 


cttggtggataccctgaag 


3194 


3213 


SEQ ID NO: 1989


cttcaggaagcttctcaag


13209 13228 1 


3 


SEQ ID NO: 987 


ttgtaactcaagcagaagg 


3214 


3233 


SEQ ID NO: 1990


ccttacacaataatcacaa 


9522 9541 1 


3 


SEQ ID NO: 988 


taactcaagcagaaggtgc 


3217 


3236 


SEQ ID NO: 1991


gcacctagctggaaagtta 


6947 6966 1 


3 


SEQ ID NO: 989 


gcagaaggtgcgaagcaga


3225 


3244 


SEQ ID NO: 1992


tctgtgggattccatctgc 


4083 4102 1 


3 


SEQ ID NO: 990 


cagaaggtgcgaagcagac 


3226 


3245 


SEQ ID NO: 1993


gtctgtgggattccatctg 


4082 4101 1 


3 


SEQ ID NO: 991 


gtatgaccttgtccagtga 


3280 


3299 


SEQ ID NO: 1994


tcaccaacggagaacatac 


10843 10862 1 


3 


SEQ ID NO: 992 


tatgaccttgtccagtgaa


3281 


3300 


SEQ ID NO: 1995


ttcaccaacggagaacata 


10842 10861 1 


3 


SEQ ID NO: 993 


gaagtccaaattccggatt 


3297 


3316 


SEQ ID NO: 1996


aatctcaagctttctcttc 


10044 10063 1 


3 


SEQ ID NO: 994 


gagggcaaaacgtcttaca 


3363 


3382 


SEQ ID NO: 1997


tgtacaactggtccgcctc 


4207 4226 1 


3 


SEQ ID NO: 995 


agggcaaaacgtcttacag 


3364 


3383 


SEQ ID NO: 1998


ctgttaggacaccagccct 


4054 4073 1 


3 


SEQ ID NO: 996 


gactcaccctggacattca 


3382 


3401 


SEQ ID NO: 1999


tgaaattcaatcacaagtc 


9068 9087 1 


3 


SEQ ID NO: 997 


ctggacattcagaacaaga 


3390 


3409 


SEQ ID NO: 2000


tcttttcttttcagcccag 


9218 9237 1 


3 


SEQ ID NO: 998 


tcatgggcgacctaagttg 


3427 


3446 


SEQ ID NO: 2001


caactgcagacatatatga 


6627 6646 1 


3 


SEQ ID NO: 999 


tgggcgacctaagttgtga 


3430 


3449 


SEQ ID NO: 2002


tcactccattaacctccca 


6308 6327 1 


3 


SEQ ID NO: 1000 


agttgtgacacaaaggaag 


3441 


3460 


SEQ ID NO: 2003


cttcttttccaattgaact 


13830 13849 1 


3 


SEQ ID NO: 1001 




3446 


3465 


SEQ ID NO: 2004


tcttcatcttcatctgtca 


10212 10231 1 


3 


SEQ ID NO: 1002 


gacacaaaggaagaaagaa


3447


3466 


SEQ ID NO: 2005


ttcttcatcttcatctgtc 


10211 10230 1 


3 


SEQ ID NO: 1003 


ggaagaaagaaaaatcaag 


3455 


3474 


SEQ ID NO: 2006




11340 11359 1 


3 



2007 aaaaagcgatggccgggtc 3947 3966



2009 tgaacaagaacagtttgaa
2010 agtttgaaaattgagattc
2011 gtttgaaaattgagattcc
2012 ttgaaaattgagattcctt
2013 ctaaagatgttagagactg
2014 atgttagagactgttagga
2015 cagccctccacttcaagtc
2016 agccctccacttcaagtct
2017 ccatctgccatctcgagag
2018 attcccaagttgtatcaac
2019 tcaactgcaagtgcctctc
2020 ggtgttctagacctctcca
2021 ctccacgaatgtctacagc
2022 cacgaatgtctacagcaac
2023 acgaatgtctacagcaact
2024 tcctacagtggtggcaaca
2025 cgttaccacatgaaggctg
2026 gaaggctgactctgtggtt
2027 tgtggttgacctgctttcc



SEQ ID NO:
3963 3982 SEQ ID NO:
3976 3995 SEQ ID NO:
3987 4006 SEQ ID NO:
3988 4007 SEQ ID NO:
3990 4009 SEQ ID NO:
4038 4057 SEQ ID NO:
4044 4063 SEQ ID NO:
4066 4085 SEQ ID NO:
4067 4086 SEQ ID NO:
4094 4113 SEQ ID NO:
4134 4153 SEQ ID NO:
4148 4167 SEQ ID NO:
4170 4189 SEQ ID NO:
4184 4203 SEQ ID NO:
4187 4206 SEQ ID NO:
4188 4207 SEQ ID NO:
4224 4243 SEQ ID NO:
4272 4291 SEQ ID NO:
4283 4302 SEQ ID NO:
4295 4314 SEQ ID NO:
4304 4323 SEQ ID NO:



2313 gaccttgcaagaatatttt
2314 tgttaacaaattccttgac
2315 ttcaagttcctgaccttca
2316 gaatctggctccctcaact
2317 ggaaataccaagtcaaaac
2318 aaggaaaagcgcacctcaa
2319 cagttgaccacaagcttag
2320 tccttaacaccttccacat
2321 gacttctctagtcaggctg
2322 agacatcgctgggctggct
2323 ctctcaaatgacatgatgg

2325 gagatcaagacactgttga
2326 tggaaccctctccctcacc
2327 gctggtaacctaaaaggag
2328 gttgcccaccatcatcgtg

2330 tgttagttgctcttaagga
2331 cagcaagtacctgagaac
2332 aacctatgccttaatcttc

2334 cacaccttgacattgcagg



6335 6354 1
7355 7374 1
8302 8321 1
9039 9058 1
10446 10465 1
12022 12041 1
10537 10556 1
8065 8084 1
8805 8824 1
5720 5739 1
5322 5341 1
6246 6265 1
8835 8854 1
4727 4746 1
5580 5599 1
11663 11682 1
11662 11681 1
13351 13370 1
8603 8622 1
13161 13180 1
6957 6976 1
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



2029 ctgctttcctacaatgtgc

2031 tatgaccacaagaatacgt
2032 atgaccacaagaatacgtc

2034 tttctagattcgaatatca

2037 cccagtctcaaaaggttta
2038 ctcaaaaggtttactaata
2039 tcaaaaggtttactaatat
2040 aaaaggtttactaatattc
2041 gaaacagcatttgtttgtc
2042 atttgtttgtcaaagaagt
2043 tcaagattgatgggcagtt
2044 ttcagagtctcttcgttct

2046 atgctaaaggcacatatgg
2047 gcacatatggcctgtcttg
2048 gagtccaacctgaggttta
2049 agtccaacctgaggtttaa

2051 gaagatggaaccctctccc
2052 tgatctgcaaagtggcatc

2054 gcttccctaaagtatgaga

2056 tctaacaagatggatatga
2057 ctgctgcgttctgaatatc
2058 tcattgaggttcttcagcc
2059 ttctggatcactaaattcc
2060 ccatggtcttgagttaaat
2061 tcttaggcactgacaaaat
2062 acaaggcgacactaaggat
2063 tgcaacgaccaacttgaag
2064 caacttgaagtgtagtctc

2070 ctcacagagctatcactgg
2071 tgggaagtgcttatcaggc
2072 ttcaaggtcagtcaagaag
2073 aatgacatgatgggctcat
2074 gctcatatgctgaaatgaa
2075 atatgctgaaatgaaattt
2076 tctgaacattgcaggctta
2077 gaacattgcaggcttatca
2078 tgcaggcttatcactggac



4305 
4311 
4344 
4345 



4404 
4441 
4448 
4454 
4455 
4457 
4535 
4543 
4561 
4578 
4580 
4597 
4606 
4659 
4660 
4684 
4722 
4754 
4755 
4785 
4796 



4324 SEQ ID NO:
4330 SEQ ID NO:
4363 SEQ ID NO:
4364 SEQ ID NO:
4374 SEQ ID NO:
4417 SEQ ID NO:
4423 SEQ ID NO:
4460 SEQ ID NO:
4467 SEQ ID NO:
4473 SEQ ID NO:
4474 SEQ ID NO:
4476 SEQ ID NO:
4554 SEQ ID NO:
4562 SEQ ID NO:
4580 SEQ ID NO:
4597 SEQ ID NO:
4599 SEQ ID NO:
4616 SEQ ID NO:
4625 SEQ ID NO:
4678 SEQ ID NO:
4679 SEQ ID NO:
4703 SEQ ID NO:
4741 SEQ ID NO:
4773 SEQ ID NO:
4774 SEQ ID NO:
4804 SEQ ID NO:
4815 SEQ ID NO:
4879 SEQ ID NO:
4918 SEQ ID NO:
4951 SEQ ID NO:
4974 SEQ ID NO:
4992 SEQ ID NO:
5018 SEQ ID NO:
5051 SEQ ID NO:
5094 SEQ ID NO:
5103 SEQ ID NO:
5127 SEQ ID NO:
5146 SEQ ID NO:
5159 SEQ ID NO:
5161 SEQ ID NO:
5204 SEQ ID NO:
5242 SEQ ID NO:
5258 SEQ ID NO:
5314 SEQ ID NO:
5347 SEQ ID NO:
5360 SEQ ID NO:
5364 SEQ ID NO:
5397 SEQ ID NO:
5400 SEQ ID NO:
5406 SEQ ID NO:



2335 gcacaccttgacattgcag
2336 atccgctggctctgaagga
2337 acgtccgtgtgccttcata
2338 gacgtccgtgtgccttcat
2339 tgattatctgaattcattc
2340 tgatttacatgatttgaaa
2341 tgaagtagctgagaaaatc
2342 tttgaaaaattctcttttc
2343 taaattcattactcctggg

2345 atattcaaaactgagttga
2346 gaatttgaaagttcgtttt
2347 gacagcatcttcgtgtttc
2348 acttaaaaaatataaaaat
2349 aactctcaagtcaagttga

2351 atagcatggacttcttctg
2352 ccatttgagatcacggcat
2353 caagttggcaagtaagtgc
2354 taaagtgccacttttactc
2355 ttaacagggaagatagact
2356 ttggcaagtaagtgctagg

2364 ggctcatatgctgaaatga

2366 atttttattcctgccatgg
2367 attttttgcaagttaaaga
2368 atccatgatctacatttgt

2370 gagatgagagatgccgttg
2371 attctcttttcttttcagc
2372 cagatacaagaaaaactgc
2373 ttcattcaattgggagaga
2374 atttgtaagaaaatacaga
2375 ctgaagcattaaaactgtt

2377 gcctacgttccatgtccca
2378 cttcagtgcagaatatgaa
2379 atgattatctgaattcatt
2380 ttcagccattgacatgagc
2381 aaatagctattgctaatat
2382 taagaaccagaagatcaga
2383 tgatatcgacgtgaggttc
2384 gtcctggattccacatgca



11079 11098 1 3
8569 8588 1 3
9976 9995 1 3
9975 9994 1 3
6479 6498 1 3
6677 6696 1 3
7094 7113 1 3
9206 9225 1 3
11294 11313 1 3
12223 12242 1 3
12222 12241 1 3
9272 9291 1 3
11206 11225 1 3
8014 8033 1 3
13414 13433 1 3
11987 12006 1 3
8865 8884 1 3
9237 9256 1 3
9364 9383 1 3
6182 6201 1 3
9300 9319 1 3
9368 9387 1 3
12283 12302 1 3
12255 12274 1 3
12254 12273 1 3
5969 5988 1 3
6912 6931 1 3
13024 13043 1 3
6893 6912 1 3
5340 5359 1 3
12541 12560 1 3
10095 10114 1 3
14011 14030 1 3
6786 6805 1 3
5177 5196 1 3
6231 6250 1 3
9214 9233 1 3
6891 6910 1 3
6491 6510 1 3
6428 6447 1 3
7498 7517 1 3
8141 8160 1 3
11348 11367 1 3
11969 11988 1 3
6478 6497 1 3
5738 5757 1 3
6694 6713 1 3
10988 11007 1 3
12482 12501 1 3
11844 11863 1 3
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



2079 tcaaaacttgacaacattt
2080 a 
2081 C 




2107 cttggatgcttacaacact
2108 ttggcgtggagcttactgg
2109 cacttttactcagtgagcc



2ingi 

2112 caattgttgcttttgtaaa



2115 ttcactccattaacctccc
2116 ttttgagaccttgcaagaa
2117 accttgcaagaatattttg
21 



21 

2120 cctgggaaaactcccacag
2121 actcccacagcaagctaat
2122 aattcattcaattgggaga
2123 ttcaattgggagagacaag
2124 aggagaaactgactgctct
2125 actgactgctctcacaaaa



2127 cagacatatatgatacaat



5412 5431 SEQ ID NO:
5427 5446 SEQ ID NO:
5435 5454 SEQ ID NO:
5460 5479 SEQ ID NO:
5483 5502 SEQ ID NO:
5588 5607 SEQ ID NO:
5591 5610 SEQ ID NO:
5594 5613 SEQ ID NO:
5608 5627 SEQ ID NO:
5618 5637 SEQ ID NO:
5678 5697 SEQ ID NO:
5697 5716 SEQ ID NO:
5732 5751 SEQ ID NO:
5782 5801 SEQ ID NO:
5783 5802 SEQ ID NO:
5784 5803 SEQ ID NO:
5786 5805 SEQ ID NO:
5792 5811 SEQ ID NO:
5793 5812 SEQ ID NO:
5851 5870 SEQ ID NO:
5871 5890 SEQ ID NO:
5906 5925 SEQ ID NO:
5975 5994 SEQ ID NO:
5985 6004 SEQ ID NO:
6001 6020 SEQ ID NO:
6038 6057 SEQ ID NO:
6053 6072 SEQ ID NO:
6076 6095 SEQ ID NO:
6095 6114 SEQ ID NO:
6124 6143 SEQ ID NO:
6190 6209 SEQ ID NO:
6227 6246 SEQ ID NO:
6249 6268 SEQ ID NO:
6268 6287 SEQ ID NO:
6278 6297 SEQ ID NO:
6280 6299 SEQ ID NO:
6307 6326 SEQ ID NO:
6329 6348 SEQ ID NO:
6336 6355 SEQ ID NO:
6415 6434 SEQ ID NO:
6443 6462 SEQ ID NO:
6452 6471 SEQ ID NO:
6461 6480 SEQ ID NO:
6489 6508 SEQ ID NO:
6495 6514 SEQ ID NO:
6526 6545 SEQ ID NO:
6533 6552 SEQ ID NO:
6536 6555 SEQ ID NO:
6633 6652 SEQ ID NO:
6649 6668 SEQ ID NO:




2401 agaagtgtcttcaaagctg 

2402 cattcaattgggagagaca 
2403 ccattcagtctctcaagac



2405 gctgttttgaagactctcc
2406 cagaattcataatcccaac
2407 actgcaagatttttcagac
2408 caagaacctgttagttgct



2411 aaatcccatccaggttttc
2412 tcctttggctgtgctttgt
2413 agtgaagttctccagcaag
2414 ccagaattcataatcccaa
2415 ggctattgatgttagagtg
2416 ggcatgatgctcatttaaa
2417 taaagccattcagtctctc
2418 tttaaccagtcagatattg
2419 tttattgctgaatccaaaa



2421 gggaaaaaacaggcttgaa 



2424 acaaagcagattatgttga
2425 ttttcagaccaactctctg
2426 ctgtctctggtcagccagg
2427 attacacttcctttcgagt



2429 ( 



2431 ttl 
2432 ctttgtgagtttatcagtc



2434 ttaaaagaaatcttcaatt



7362 7381 1 3
8014 8033 1 3
10666 10685 1 3
5570 5589 1 3
7267 7286 1 3
9368 9387 1 3
6263 6282 1 3
10119 10138 1 3
7057 7076 1 3
12150 12169 1 3
9079 9098 1 3
7442 7461 1 3
8588 8607 1 3
8620 8639 1 3
8619 8638 1 3
8618 8637 1 3
12404 12423 1 3
6493 6512 1 3
12967 12986 1 3
12205 12224 1 3
1080 1099 1 3
8266 8285 1 3
13604 13623 1 3
13343 13362 1 3
6410 6429 1 3
11602 11621 1 3
8029 8048 1 3
9674 9693 1 3
8591 8610 1 3
8265 8284 1 3
6980 6999 1 3
9169 9188 1 3
12962 12981 1 3
10179 10198 1 3
13647 13666 1 3
6350 6369 1 3
9568 9587 1 3
9558 9577 1 3
12940 12959 1 3
11821 11840 1 3
13614 13633 1 3
7716 7735 1 3
12861 12880 1 3
10471 10490 1 3
11800 11819 1 3
11155 11174 1 3
8372 8391 1 3
9687 9706 1 3
1925 1944 1 3
13807 13826 1 3
2130 tttgaaaatagctattgct
2131 ttgaaaatagctattgcta

2133 attattgatgaaatcattg
2134 aaagtcttgatgagcacta
2135 aagtcttgatgagcactat
2136 ttgatgagcactatcatat

2138 ttttagtaaaaacaatcca

2140 attgattttaacaaaagtg
2141 attttaacaaaagtggaag

2146 tgagcatgtcaaacacttt
2147 gagcatgtcaaacactttg
2149 tgagaaaatcaatgccttc
2150 tatgaagtagaccaacaaa
2151 aagtagaccaacaaatcca
2152 aagttgaaggagactattc
2153 acaagttaagataaaagat
2154 aagataaaagattactttg
2155 gattactttgagaaattag
2156 tgagaaattagttggattt
2157 aaattagttggatttattg
2158 tggatttattgatgatgct
2159 tcattgaagatgttaacaa
2160 cattgaagatgttaacaaa
2162 ttgaagatgttaacaaatt
2163 tgaagatgttaacaaattc
2164 acatgttgataaagaaatt
2165 tttgattaccaccagtttg

2167 aaaatccgtgaggtgactc
2168 aggtgactcagagactcaa

2170 gttgcagtgtatctggaaa
2171 ttaagttcagcatctttgg
2172 tgaaggccaaattccgaga
2173 aatgtatcaaatggacatt
2174 attcagcaggaacttcaac
2175 acctgtctctggtcagcca
2176 cctgtctctggtcagccag
2177 ggtcagccaggtttatagc
2178 ccaggtttatagcacactt



6675 6694 SEQ ID NO:
6689 6708 SEQ ID NO:
6690 6709 SEQ ID NO:
6695 6714 SEQ ID NO:
6711 6730 SEQ ID NO:
6739 6758 SEQ ID NO:
6740 6759 SEQ ID NO:
6745 6764 SEQ ID NO:
6769 6788 SEQ ID NO:
6772 6791 SEQ ID NO:
6797 6816 SEQ ID NO:
6816 6835 SEQ ID NO:
6820 6839 SEQ ID NO:
6880 6899 SEQ ID NO:
6905 SEQ ID NO:
1 ID NO:
6942 6961 SEQ ID NO:
7052 7071 SEQ ID NO:
7053 7072 SEQ ID NO:
7062 7081 SEQ ID NO:
7103 7122 SEQ ID NO:
7152 7171 SEQ ID NO:
7156 7175 SEQ ID NO:
7215 7234 SEQ ID NO:
7256 7275 SEQ ID NO:
7263 7282 SEQ ID NO:
7272 7291 SEQ ID NO:
7280 7299 SEQ ID NO:
7284 7303 SEQ ID NO:
7292 7311 SEQ ID NO:
7345 7364 SEQ ID NO:
7346 7365 SEQ ID NO:
7347 7366 SEQ ID NO:
7348 7367 SEQ ID NO:
7349 7368 SEQ ID NO:
7372 7391 SEQ ID NO:
7398 7417 SEQ ID NO:
7433 7452 SEQ ID NO:
7434 7453 SEQ ID NO:
7444 7463 SEQ ID NO:
7465 7484 SEQ ID NO:
7539 7558 SEQ ID NO:
7608 7627 SEQ ID NO:
7633 7652 SEQ ID NO:
7676 7695 SEQ ID NO:
7692 7711 SEQ ID NO:
7714 7733 SEQ ID NO:
7715 7734 SEQ ID NO:
7724 7743 SEQ ID NO:
7730 7749 SEQ ID NO:



2435 tcaatgattatatcccata

2439 caataccagaattcataat
2440 tagtgattacacttccttt
2441 atagcaacactaaatactt
2442 atatccaagatgagatcaa
2443 attgagattccctccatta
2444 tggagtgccagtttgaaaa
2445 atttcctaaagctggatgt

2448 tgtaccataagccatattt
2449 ttttctaaacttgaaattc

2451 ttccaatttccctgtggat
2452 aaagtgccacttttactca

2454 gattatatcccatatgttt

2456 tttgtggagggtagtcata
2457 tggatgaagatgacgactt

2460 caaaatagaagggaatctt
2462 aaatccgtgaggtgactca
2463 caattttgagaatgaattt
2464 agcatgcctagtttctcca
2465 ttgtagatgaaaccaatga
2466 tttgtagatgaaaccaatg
2467 atttaagtatgatttcaat
2468 aatttaagtatgatttcaa
2469 gaatttaagtatgatttca

2473 gagtgaaatgctgtttttt
2474 ttgatgatatctggaacct
2476 tttcaagcaaatgcacaac
2477 ccaatgctgaactttttaa
2478 tctcctttcttcatcttca
2479 aatgaagtccggattcatt
2480 gttgagaagccccaagaat
2481 tggcaagtaagtgctaggt
2482 ctggacttctctagtcagg



13120 13139 1 3
13856 13875 1 3
13855 13874 1 3
14076 14095 1 3
8260 8279 1 3
12856 12875 1 3
8761 8780 1 3
13093 13112 1 3
11694 11713 1 3
11802 11821 1 3
11167 11186 1 3
9863 9882 1 3
8006 8025 1 3
10080 10099 1 3
9057 9076 1 3
9483 9502 1 3
3680 3699 1 3
6183 6202 1 3
5326 5345 1 3
13125 13144 1 3
12021 12040 1 3
10323 10342 1 3
12148 12167 1 3
10160 10179 1 3
11326 11345 1 3
2069 2088 1 3
9061 9080 1 3
7435 7454 1 3
10411 10430 1 3
9945 9964 1 3
7414 7433 1 3
7413 7432 1 3
10487 10506 1 3
10486 10505 1 3
10485 10504 1 3
11479 11498 1 3
8783 8802 1 3
7964 7983 1 3
8630 8649 1 3
10723 10742 1 3
8401 8420 1 3
8532 8551 1 3
10165 10184 1 3
10205 10224 1 3
11013 11032 1 3
6246 6265 1 3
9369 9388 1 3
8802 8821 1 3



2484 aagtccggattcattctgg
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



2179 gtttatagcacacttgtca



2181 ctgattggtggactcttgc

211 



2184 gggttcactgttcctgaaa
2185 tcaagaccatccttgggac
2186 ccttgggacGatgcctgcc
2187 ttcaggctcttcagaaagc



2191 g; 

2192 cattccttcctttacaatt
2193 ttgaccagatgctgaacag
2194 aatcaccctgccagacttc
2195 tgaccttcacataccagaa
2196 ttccagcttccccacatct
2197 aagctatacagtattctga
21 
21 

2200 caaatgctgacatagggaa
2201 gagagtccaaattagaagt
2202 agagtccaaattagaagtt
2203 tctcaattttgattttcaa



2205 aatgcacaactctcaaacc
2206 agttctccagcaagtacct
2207 agtacctgagaacggagca



2210 ctggatagcaacactaaat
2211 ctgacctgcgcaacgagat
2212 agatgagggaacacatgaa
2213 tcaacttttctaaacttga
2214 ttctaaacttgaaattcaa
2215 gaaattcaatcacaagtcg



2217 actgtttggagaagggaag
2218 aattctcttttcttttcag
2219 ttcttttcagcccagccat
2220 tttgaaagttcgttttcca
2221 cagggaagatagacttcct



2225 agcaaatctggatttctta
2226 tcctttaacaattcctgaa
2227 tttaacaattcctgaaatg



7734 7753 SEQ ID NO:
7745 7764 SEQ ID NO:
7762 7781 SEQ ID NO:
7839 7858 SEQ ID NO:
7840 7859 SEQ ID NO:
7860 7879 SEQ ID NO:
7879 7898 SEQ ID NO:
7889 7908 SEQ ID NO:
7921 7940 SEQ ID NO:
7996 8015 SEQ ID NO:
8005 8024 SEQ ID NO:
8031 8050 SEQ ID NO:
8055 8074 SEQ ID NO:
8081 8100 SEQ ID NO:
8137 8156 SEQ ID NO:
8225 8244 SEQ ID NO:
8312 8331 SEQ ID NO:
8331 8350 SEQ ID NO:
8379 8398 SEQ ID NO:
8391 8410 SEQ ID NO:
8414 8433 SEQ ID NO:
8428 8447 SEQ ID NO:
8500 8519 SEQ ID NO:
8501 8520 SEQ ID NO:
8519 8538 SEQ ID NO:
8522 8541 SEQ ID NO:
8541 8560 SEQ ID NO:
8596 8615 SEQ ID NO:
8608 8627 SEQ ID NO:
8670 8689 SEQ ID NO:
8743 8762 SEQ ID NO:
8757 8776 SEQ ID NO:
8821 8840 SEQ ID NO:
8921 8940 SEQ ID NO:
9052 9071 SEQ ID NO:
9059 9078 SEQ ID NO:
9069 9088 SEQ ID NO:
9133 9152 SEQ ID NO:
9134 9153 SEQ ID NO:
9213 9232 SEQ ID NO:
9222 9241 SEQ ID NO:
9275 9294 SEQ ID NO:
9304 9323 SEQ ID NO:
9397 9416 SEQ ID NO:
9427 9446 SEQ ID NO:
9455 9474 SEQ ID NO:
9470 9489 SEQ ID NO:
9494 9513 SEQ ID NO:
9497 9516 SEQ ID NO:
9526 9545 SEQ ID NO:



2485 tgacctgtccattcaaaac



2488 gctcatctcctttcttcat
2489 tgctcatctcctttcttca



2491 gtccccctaacagatttga 



2497 tgttgaagtgtctccattc
2498 aattccaattttgagaatg 



2500 gaagttctcaattttgatt 
2501 th 



2503 tcagatggcattgctgctt
2504 gagataaccgtgcctgaat



2506 ttccatcacaaatcctttg
2507 actttacttcccaactctc
2508 aactttacttcccaactct
2509 ttgattcccttttttgaga



2511 ggtttatcaaggggccatt
2512 aggttccatcgtgcaaact
2513 tgctccaggagaacttact
2514 aactctcaagtcaagttga
2515 tccattctgaatatattgt
2516 attttctgaacttccccag
2517 atctgatgaggaaactcag



2519 tcaaggataacgtgtttga
2520 ttgatgatgctgtcaagaa
2521 cgacgaagaaaataatttc
2522 ttccagaaagcagccagtg
2524 ctgattactatgaaaaatt




2533 catttgatttaagtgtaaa
2534 ggagacagcatcttcgtgt



13673 13692 1 3
10275 10294 1 3
14018 14037 1 3
10200 10219 1 3
10199 10218 1 3
8951 8970 1 3
7965 7984 1 3
13970 13989 1 3
9580 9599 1 3
13175 13194 1 3
6621 6840 1 3
9464 9483 1 3
9881 9900 1 3
10406 10425 1 3
12924 12943 1 3
8514 8533 1 3
8876 8895 1 3
8913 8932 1 3
11604 11623 1 3
11544 11563 1 3
9730 9749 1 3
9662 9681 1 3
13402 13421 1 3
13401 13420 1 3
11529 11548 1 3
13652 13671 1 3
12452 12471 1 3
11380 11399 1 3
13772 13791 1 3
13414 13433 1 3
13372 13391 1 3
12694 12713 1 3
12251 12270 1 3
10030 10049 1 3
12610 12629 1 3
7300 7319 1 3
13558 13577 1 3
12498 12517 1 3
2890 2909 1 3
13630 13649 1 3
13486 13505 1 3
10372 10391 1 3
9840 9859 1 3
14030 14049 1 3
13372 13391 1 3
11726 11745 1 3
11711 11730 1 3
13198 13217 1 3
9613 9632 1 3
11203 11222 1 3
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



2229 aagatttctctctatggga
2230 gaaaaaacaggcttgaagg

2232 tgaaggaattcttgaaaac
2233 agctcagtataagaaaaac
2234 tcaaatcctttgacaggca
2235 atgaaacaaaaattaagtt
2236 aattcctggatacactgtt

2238 aagtgtctccattcaccat

2240 ctgccatgggcaatattac
2241 tgaataccaatgctgaact
2242 tattgttgctcatctcctt
2243 tgttgctcatctcctttct
2244 tctgtcattgatgcactgc
2245 ccacagctctgtctctgag
2246 atttgtggagggtagtcat
2247 atatggaagtgtcagtggc
2248 tggaaataccaagtcaaaa
2249 aagtcaaaacctactgtct
2250 actgtctcttcctccatgg
2251 cttcctccatggaatttaa
2252 attcttcaatgctgtactc
2253 ttgaccacaagcttagctt

2257 gagaacatacaagcaaagc
2258 atggcaaatgtcagctctt
2259 tggcaaatgtcagctcttg
2260 ttgttcaggtccatgcaag
2261 tgttcaggtccatgcaagt
2262 agttccttccatgatttcc

2264 actaagaaccagaagatca

2266 cagaagatcagatggaaaa
2267 aaaaatgaagtccggattc

2269 aagaaaaggcacaccttga
2270 aaggacacctaaggttcct
2271 ccagcattggtaggagaca
2272 ctttgtgtacaccaaaaac
2273 ccatccctgtaaaagtttt
2274 tgatctaaattcagttctt
2275 aagaagctgagaacttcat
2276 tttgccctcaacctaccaa
2277 cttgattcccttttttgag
2278 ttcacgcttccaaaaagtg



9553 9572 SEQ ID NO:
9570 9589 SEQ ID NO:
9582 9601 SEQ ID NO:
9583 9602 SEQ ID NO:
9632 9651 SEQ ID NO:
9712 9731 SEQ ID NO:
9781 9800 SEQ ID NO:
9851 9870 SEQ ID NO:
9868 9887 SEQ ID NO:
9886 9905 SEQ ID NO:
9942 9961 SEQ ID NO:
10105 10124 SEQ ID NO:
10159 10178 SEQ ID NO:
10193 10212 SEQ ID NO:
10196 10215 SEQ ID NO:
10224 10243 SEQ ID NO:
10297 10316 SEQ ID NO:
10322 10341 SEQ ID NO:
10369 10388 SEQ ID NO:
10445 10464 SEQ ID NO:
10455 10474 SEQ ID NO:
10467 10486 SEQ ID NO:
10474 10493 SEQ ID NO:
10504 10523 SEQ ID NO:
10540 10559 SEQ ID NO:
10565 10584 SEQ ID NO:
10702 10721 SEQ ID NO:
10715 10734 SEQ ID NO:
10852 10871 SEQ ID NO:
10889 10908 SEQ ID NO:
10890 10909 SEQ ID NO:
10906 10925 SEQ ID NO:
10907 10926 SEQ ID NO:
10932 10951 SEQ ID NO:
10979 10998 SEQ ID NO:
10986 11005 SEQ ID NO:
10987 11006 SEQ ID NO:
10995 11014 SEQ ID NO:
11010 11029 SEQ ID NO:
11024 11043 SEQ ID NO:
11071 11090 SEQ ID NO:
11107 11126 SEQ ID NO:
11191 11210 SEQ ID NO:
11231 11250 SEQ ID NO:
11269 11288 SEQ ID NO:
11324 11343 SEQ ID NO:
11424 11443 SEQ ID NO:
11445 11464 SEQ ID NO:
11528 11547 SEQ ID NO:
11583 11602 SEQ ID NO:



2535 tcccagaaaacctcttctt

2536 ccttttacaattaattttc

2537 ttttgagaatgaatttcaa



2540 tgcctgagcagaccattga
2541 aactttgcactatgttcat
2542 aacacatgaatcacaaatt



2544 atgggaagtataagaactt
2545 agaaaaggcacaccttgac



2547 agttgaaggagactattca

2548 aaggaaacataaactaata



2550 gcagtagactataagcaga

2551 ctcagggatctgaaggtgg

2552 atgaagtagaccaacaaat

2553 gccacactccaacgcatat



2555 agacctagtgattacactt
2556 ccatgcaagtcagcccagt
2557 ttaatcgagaggtatgaag
2558 gagttgagggtccgggaat
2559 aagcgcacctcaatatcaa



2561 ttgggaagaagaggcagct

2562 gatatacactagggaggaa

2563 gcttggttttgccagtctc

2564 aagaggtatttaaagccat

2565 caagaggtatttaaagcca



2567 acttgggggaggaggaaca
2568 ggaatctgatgaggaaact



2570 tgatcaagaacctgttagt
2571 ctgatcaagaacctgttag
2572 ttttcagaccaactctctg
2573 gaatttgaaagttcgtttt



2575 tcaaaacctactgtctctt
2576 aggacaccaaaataacctt
2577 tgtcaacaagtaccactgg
2578 gtttttaaattgttgaaag



2580 aagatagtcagtctgatca

2581 atgagatcaacacaatctt

2582 ttggtacgagttactcaaa
2583 ctcaattttgattttcaag
2584 cactcattgattttctgaa



13013 13032 1 3

10414 10433 1 3

11283 11302 1 3

9797 9816 1 3

11680 11699 1 3

12754 12773 1 3

8930 6949 1 3

13199 13218 1 3

4834 4853 1 3

11072 11091 1 3

6432 6451 1 3

7216 7235 1 3

12881 12900 1 3

12423 12442 1 3

13920 13939 1 3

8187 8206 1 3

7153 7172 1 3

10770 10789 1 3

13015 13034 1 3

12851 12870 1 3

10916 10935 1 3

7140 7159 1 3

12234 12253 1 3

12028 12047 1 3

10641 10660 1 3

12281 12300 1 3

12737 12756 1 3

2459 2478 1 3

12952 12971 1 3

12951 12970 1 3

14058 14077 1 3

14057 14076 1 3

12248 12267 1 3

11178 11197 1 3

13339 13358 1 3

13338 13357 1 3

13614 13633 1 3

9272 9291 1 3

13158 13177 1 3

10458 10477 1 3

7564 7583 1 3

12362 12381 1 3
13140 13159 1

8885 8904 1
13326 13345 1
13102 13121 1
12633 12652 1

8520 8539 1
12685 12704 1
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



2280 aatgcagtagccaacaaga
2281 ctgagcagaccattgagat
2282 tgagcagaccattgagatt



2284 acttggagtgccagtttga
2285 caaatttgaaggacttcag
2286 agcccagcgttcaccgatc
2287 cagcgttcaccgatctcca
2288 ctccatctgcgctaccaga



2290 aggcagcttctggcttgct

2291 tgaaagacaacgtgcccaa
2292 tatgattatgtcaacaagt



2294 ttgactcaggaaggccaag



2296 tcctttcgagttaaggaaa
2297 gccattcagtctctcaaga



2299 agctgaaagagatgaaatt
2300 aatttacttatcttattaa
2301 ttttaaattgttgaaagaa
2302 taatcttcataagttcaat



2305 caatttctgcacagaaata
2306 agaagattgcagagctttc
2307 gaagaaaataatttctgat



2309 tcaaaactaccacacattt
2310 ttttttaaaagaaatcttc
2311 aggatctgagttattttgc



11600 11619 SEQ ID NO:
11631 11650 SEQ ID NO:

11683 11702 SEQ ID NO:

11684 11703 SEQ ID NO:
11695 11714 SEQ ID NO:
11799 11818 SEQ ID NO:
11996 12015 SEQ ID NO:
12048 12067 SEQ ID NO:
12052 12071 SEQ ID NO:
12066 12085 SEQ ID NO:
12256 12275 SEQ ID NO:
12292 12311 SEQ ID NO:
12319 12338 SEQ ID NO:
12354 12373 SEQ ID NO:
12467 12486 SEQ ID NO:
12576 12595 SEQ ID NO:
12728 12747 SEQ ID NO:
12869 12888 SEQ ID NO:
12966 12985 SEQ ID NO:
12993 13012 SEQ ID NO:
13057 13076 SEQ ID NO:
13072 13091 SEQ ID NO:
13142 13161 SEQ ID NO:
13172 13191 SEQ ID NO:
13271 13290 SEQ ID NO:
13303 13322 SEQ ID NO:
13434 13453 SEQ ID NO:
13501 13520 SEQ ID NO:
13562 13581 SEQ ID NO:
13672 13691 SEQ ID NO:
13585 13704 SEQ ID NO:
13803 13822 SEQ ID NO:
14035 14054 SEQ ID NO:
14049 14068 SEQ ID NO:



2585 agcagattatgttgaaaca

2586 tcttttcagcccagccatt

2587 atctgatgaggaaactcag



2589 ttaatcttcataagttcaa

2590 tcaattgggagagacaagt

2591 ctgagaacttcatcatttg

2592 gatccaagtatagttggct

2593 tggacctgcaccaaagctg

2594 tctgatatacatcacggag

2595 ttgagttgcccaccatcat

2596 agcaagtctttcctggcct

2597 ttgggagagacaagtttca

2598 actttgcactatgttcata

2599 atcaacacaatcttcaatg



2601 agtgattacacttcctttc
2602 tttctgccactgctcagga
2603 tcttccgttctgtaatggc



2606 ttaaaagaaatcttcaatt



2610 tcaaccttaatgattttca

2611 tattcttcttttccaattg

2612 gaaatcttcaatttattct

2613 atcagttcagataaacttc

2614 ttttgagaatgaatttcaa



2617 gcaagggttcactgttcct



11825 11844 1 3

9223 9242 1 3

12251 12270 1 3

12250 12269 1 3

13171 13190 1 3

6496 6515 1 3

11430 11449 1 3

13278 13297 1 3

13952 13971 1 3

13703 13722 1 3

11659 11678 1 3

3010 3029 1 3

6500 6519 1 3

12755 12774 1 3

13107 13126 1 3

12632 12651 1 3

12857 12876 1 3

13516 13535 1 3

5794 5813 1 3

13956 13975 1 3

13192 13211 1 3

13807 13826 1 3

9558 9577 1 3

11694 11713 1 3

13929 13948 1 3

8287 8306 1 3

13826 13845 1 3

13813 13832 1 3

7991 8010 1 3

10414 10433 1 3

7362 7381 1 3

10374 10393 1 3

7856 7875 1 3

9834 9853 1 3



# = Match Number 

B = Middle Matching Bases 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



3965 aaattctcttttcttttca



3968 tgctaagaaccttactgac



Table 9. Selected palindromic sequences from human ApoB 

Source Start End Match 

Index Index 

517 536 SEQ ID NO:
4107 4126 SEQ ID NO:
7064 7083 SEQ ID NO:
7076 7095 SEQ ID NO:
8888 8907 SEQ ID NO:
10908 10927 SEQ ID NO:
364 383 SEQ ID NO:
1089 1108 SEQ ID NO:
1305 1324 SEQ ID NO:
2076 2095 SEQ ID NO:
3628 3647 SEQ ID NO:
4238 4257 SEQ ID NO:
5958 5977 SEQ ID NO:
7733 7752 SEQ ID NO:
10735 10754 SEQ ID NO:
13431 13450 SEQ ID NO:
13488 13507 SEQ ID NO:
4551 4570 SEQ ID NO:
212 231 SEQ ID NO:
1737 1756 SEQ ID NO:
1964 1983 SEQ ID NO:
3578 3597 SEQ ID NO:
6422 6441 SEQ ID NO:
7334 7353 SEQ ID NO:
138 157 SEQ ID NO:
279 298 SEQ ID NO:
290 309 SEQ ID NO:
456 475 SEQ ID NO:

566 585 SEQ ID NO:

567 586 SEQ ID NO:
623 642 SEQ ID NO:
679 698 SEQ ID NO:
830 849 SEQ ID NO:
862 881 SEQ ID NO:
873 892 SEQ ID NO:

1095 1114 SEQ ID NO:
1129 1148 SEQ ID NO:
1197 1216 SEQ ID NO:
1217 1236 SEQ ID NO:
1224 1243 SEQ ID NO:
1286 1305 SEQ ID NO:
1311 1330 SEQ ID NO:
1511 1530 SEQ ID NO:
1746 1765 SEQ ID NO:
1911 1930 SEQ ID NO:

1954 1973 SEQ ID NO:

1955 1974 SEQ ID NO:
1964 1983 SEQ ID NO:




3951 caataagatcaatagcaaa 



3970 agaaaaggcacaccttgac 

3971 ttccaatttccctgtggat 
3972acttcagagaaatacaaat 



3976igga 

3977 caaagaagtaaagattgat 
3978atgtgttaacaaaatattc 




3992 agagctgccagtccttcat 

3993 cctcctacagtggtggcaa 

3994 ctggattccacatgcagct



10903 
7385 
9017 
11364 11383 



7366 



9220 



3999tctgaattcattcaattgg 

4000 aactaccctcactgccttt

4001 g; 



4027 
9239 
10657 10676 
10732 10751 
7789 7808 
7871 7890 
11080 11099 
3688 3707 



5097 5116 



10317 10336 
10893 10912 



6008 6027 1 

5245 5264 1 

2898 2917 1 

5929 5948 1 

7224 7243 1 

1169 1188 1 

10024 10043 1 



7609 7628 

6508 6527 

7240 7259 

8506 8525 

6493 6512 

2140 2159 

5924 5943 

5490 5509 
atttgaaatccaaagaagt



2330 2349 SEQ ID NO:
2389 2408 SEQ ID NO:

2569 2588 SEQ ID NO:

2570 2589 SEQ ID NO:
2572 2591 SEQ ID NO:
2580 2599 SEQ ID NO:
2611 2630 SEQ ID NO:
2687 2706 SEQ ID NO:
2892 2911 SEQ ID NO:
3173 3192 SEQ ID NO:
3373 3392 SEQ ID NO:
3395 3414 SEQ ID NO:
3437 3456 SEQ ID NO:
3626 3645 SEQ ID NO:
3664 3683 SEQ ID NO:
3668 3687 SEQ ID NO:
4517 4536 SEQ ID NO:

4552 4571 SEQ ID NO:

4553 4572 SEQ ID NO:
5854 5873 SEQ ID NO:
5925 5944 SEQ ID NO:
5934 5953 SEQ ID NO:
6017 6036 SEQ ID NO:
6330 6349 SEQ ID NO:
6421 6440 SEQ ID NO:
6673 6692 SEQ ID NO:
6721 6740 SEQ ID NO:
6798 6817 SEQ ID NO:
6927 6946 SEQ ID NO:
6930 6949 SEQ ID NO:
7062 7081 SEQ ID NO:

7523 7542 SEQ ID NO:

7524 7543 SEQ ID NO:
9315 9334 SEQ ID NO:
9342 9361 SEQ ID NO:
9676 9695 SEQ ID NO:
9861 9880 SEQ ID NO:

10050 10069 SEQ ID NO:
10218 10237 SEQ ID NO:
10529 10548 SEQ ID NO:
10530 10549 SEQ ID NO:
10839 10858 SEQ ID NO:
11273 11292 SEQ ID NO:
11824 11843 SEQ ID NO:
12690 12709 SEQ ID NO:
12898 12917 SEQ ID NO:
13208 13227 SEQ ID NO:
2374 2393 SEQ ID NO:

2408 2427 SEQ ID NO:

2409 2428 SEQ ID NO:



4007 gt
4008 agaaggatggcattttttg
4009 ttcagagccaaagtccatg




4017 ttttttggaa
4018 tcatgtgatgggtctctac
4019 agtcaagaaggacttaagc

4020 gacttcagagaaatacaaa

4021 tgacttcagagaaatacaa




4027 aaattaaaaagtcttgatg
4028 taaaccaaaacttggttta
4029 ttcaaagacttaaaaaata
4030 ataaagaaattaaagtcat




4045 ctttttcaccaacggagaa

4046 ataaactgcaagatttttc

4047 agaaaatcaggatctgagt

4048 gattaccaccagcagttta
4049 cttcgtgaagaataffltg

4050 aacacttacttgaattcca

4051 ct



5708 5727 

9500 9519 

9499 9518 

7922 7941 

14008 14027 

7127 7146 

3058 3077 

5147 5166 

12984 13003 

12325 12344 

5683 5702 



5312 5331 

11408 11427 
11407 11426 

8227 8246 

10607 10626 

7034 7053 

7792 7811 

12012 12031 

6740 6759 

9027 9046 

8015 8034 

7388 7407 

13390 13409 

11335 11354 

6015 6034 

11238 11257 

11237 11256 

12832 12851 

12531 12550 

10071 10090 

12577 12596 

11052 11071 

13063 13082 

11769 11788 

11768 11787 

10965 10984 

10846 10865 

13608 13627 

14035 14054 

13586 13605 

13268 13287 

10670 10689 

11410 11429 

11409 11428 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



tgttttgaagactctccag 



gaagtccaaattccggatt 



atttacagctctgacaagt 
aggagcctaccaaaataat 



tgaagaagatggcaaattt 

aggatctgagttattttgc 

gtgcccttctcggttgctg 



ctggaaaatgtcagcctgg 



ctgctgattcaagaagtgc 



gcttcatcctgaagaccag



tttgctgcagccatgtcca 
caagaggggcatcatttct 
tcactttaccgtcaagacg 



caaggagcaacacctcttc 



998 1017 SEQ ID NO:
1090 1109 SEQ ID NO:
1332 1351 SEQ ID NO:
1876 1895 SEQ ID NO:
2409 2428 SEQ ID NO:
2416 2435 SEQ ID NO:
2438 2457 SEQ ID NO:
2618 2637 SEQ ID NO:
3305 3324 SEQ ID NO:
3504 3523 SEQ ID NO:
3629 3648 SEQ ID NO:
4605 4624 SEQ ID NO:
4745 4764 SEQ ID NO:
5435 5454 SEQ ID NO:
5602 5621 SEQ ID NO:
6409 6428 SEQ ID NO:
9426 9445 SEQ ID NO:
9590 9609 SEQ ID NO:
10751 10770 SEQ ID NO:
11992 12011 SEQ ID NO:
14043 14062 SEQ ID NO:
26 45 SEQ ID NO:
154 173 SEQ ID NO:
162 181 SEQ ID NO:
178 197 SEQ ID NO:
201 220 SEQ ID NO:
212 231 SEQ ID NO:

304 323 SEQ ID NO:

305 324 SEQ ID NO:
313 332 SEQ ID NO:
318 337 SEQ ID NO:

334 353 SEQ ID NO:

335 354 SEQ ID NO:
350 369 SEQ ID NO:
352 371 SEQ ID NO:
377 398 SEQ ID NO:
382 401 SEQ ID NO:
384 403 SEQ ID NO:
387 406 SEQ ID NO:
460 479 SEQ ID NO:
468 487 SEQ ID NO:
473 492 SEQ ID NO:
482 501 SEQ ID NO:
586 605 SEQ ID NO:
682 701 SEQ ID NO:
686 705 SEQ ID NO:
861 880 SEQ ID NO:
875 894 SEQ ID NO:
901 920 SEQ ID NO:




4077 cctggati
4078 tttttcttcactacatctt

4079 ccagacttccacatcccag

4080 agcatgcctagtttctcca

4081 cagcatgcctagtttctcc
4082 tcttccatcacttgaccca




4094 ctcaaccttaatgattttc

4095 gcaagctatacagtattct
4096 ctgcaggggatcccccaga
4097 tggaagtgtcagtggcaaa
4098 agaataaatgacgttcttg




4561 4580 2 



11384 11403 2 
10467 10486 2 
12664 12683 2 



9063 9082 
11838 11857 
10202 10221 



10928 10947 
5748 5767 



2611 
3942 
9972 
9971 
2069 



4985 5004 



8294 8313 

8385 8404 

2534 2553 

10380 10399 

7043 7062 

4368 4387 

7374 7393 

12506 12525 
SEQ ID NO: 2766  acagactttgaaacttgaa  967    986    SEQ ID NO: 4104  ttcaattcttcaatgctgt  10508  10527  1  5
SEQ ID NO: 2767  tgatgaagcagtcacatct  1195   1214   SEQ ID NO: 4105  agatttgaggattccatca  7984   8003   1  5
SEQ ID NO: 2768  agcagtcacatctctcttg  1201   1220   SEQ ID NO: 4106  caaggagaaactgactgct  6532   6551   1  5
SEQ ID NO: 2769  ccagccccatcactttaca  1239   1258   SEQ ID NO: 4107  tgtagtctcctggtgctgg  5102   5121      5
SEQ ID NO: 2770                       1288   1307   SEQ ID NO: 4108  ctggagcttagtaatggag  8717   8736      5
SEQ ID NO: 2771  catgccaacccccttctga  1322   1341   SEQ ID NO: 4109  tcagatgagggaacacatg  8927   8946      5
SEQ ID NO: 2772  gagagatcttcaacatggc  1398   1417   SEQ ID NO: 4110  gccaccctggaactctctc  10877  10896     5
SEQ ID NO: 2773  tcaacatggcgagggatca  1407   1426   SEQ ID NO: 4111  tgatcccacctctcattga  2973   2992      5
SEQ ID NO: 2774  ccaccttgtatgcgctgag  1437   1456   SEQ ID NO: 4112  ctcagggatctgaaggtgg  8195   8214      5
SEQ ID NO: 2775  gtcaacaactatcataaga  1463   1482   SEQ ID NO: 4113  tcttgagttaaatgctgac  4987   5006      5
SEQ ID NO: 2776  tggacattgctaattacct  1509   1528   SEQ ID NO: 4114  aggtatattcgaaagtcca  12807  12826     5
SEQ ID NO: 2777  ggacattgctaattacctg  1510   1529   SEQ ID NO: 4115  caggtatattcgaaagtcc  12806  12825     5
SEQ ID NO: 2778  ttctgcgggtcattggaaa  1581   1600   SEQ ID NO: 4116  tttcacatgccaaggagaa  6522   6541      5
SEQ ID NO: 2779  ccagaactcaagtcttcaa  1628   1647   SEQ ID NO: 4117  ttgaagtgtagtctcctgg  5096   5115      5
SEQ ID NO: 2780  agtcttcaatcctgaaatg  1638   1657   SEQ ID NO: 4118  catttctgattggtggact  7765   7784      5
SEQ ID NO: 2781  tgagcaagtgaagaacttt  1876   1895   SEQ ID NO: 4119  aaagtgccacttttactca  6191   6210      5
SEQ ID NO: 2782  agcaagtgaagaactttgt  1878   1897   SEQ ID NO: 4120  acaaagtcagtgccctgct  6015   6034      5
SEQ ID NO: 2783  tctgaaagaatctcaactt  1972   1991   SEQ ID NO: 4121  aagtccataatggttcaga  12819  12838     5
SEQ ID NO: 2784  actgtcatggacttcagaa  1994   2013   SEQ ID NO: 4122  ttctgaatatattgtcagt  13384  13403     5
SEQ ID NO: 2785  acttgacccagcctcagcc  2059   2078   SEQ ID NO: 4123  ggctcaccctgagagaagt  12399  12418     5
SEQ ID NO: 2786  tccaaataactaccttcct  2104   2123   SEQ ID NO: 4124  aggaagatatgaagatgga  4720   4739      5
SEQ ID NO: 2787  actaccctcactgcctttg  2141   2160   SEQ ID NO: 4125  caaatttgtggagggtagt  10327  10346     5
SEQ ID NO: 2788  ttggatttgcttcagctga  2157   2176   SEQ ID NO: 4126  tcagtataagtacaaccaa  9400   9419      5
SEQ ID NO: 2789  ttggaagctctttttggga  2219   2238   SEQ ID NO: 4127  tcccgattcacgcttccaa  11585  11604     5
SEQ ID NO: 2790  ggaagctctttttgggaag  2221   2240   SEQ ID NO: 4128  cttcagaaagctaccttcc  7937   7956      5
SEQ ID NO: 2791  tttttcccagacagtgtca  2246   2265   SEQ ID NO: 4129  tgaccttctctaagcaaaa  4884   4903      5
SEQ ID NO: 2792  agacagtgtcaacaaagct  2254   2273   SEQ ID NO: 4130  agcttggttttgccagtct  2466   2485      5
SEQ ID NO: 2793  ctttggctataccaaagat  2329   2348   SEQ ID NO: 4131  atctcgtgtctaggaaaag  5976   5995      5
SEQ ID NO: 2794  caaagatgataaacatgag  2341   2360   SEQ ID NO: 4132  ctcaaggataacgtgtttg  12617  12636     5
SEQ ID NO: 2795  gatatggtaaatggaataa  2363   2382   SEQ ID NO: 4133  ttatcttattaattatatc  13087  13106     5
SEQ ID NO: 2796  ggaataatgctcagtgttg  2375   2394   SEQ ID NO: 4134  caacacttacttgaattcc  10669  10688     5
SEQ ID NO: 2797  tttgaaatccaaagaagtc  2410   2429   SEQ ID NO: 4135  gacttcagagaaatacaaa  11408  11427     5
SEQ ID NO: 2798  gatcccccagatgattgga  2542   2561   SEQ ID NO: 4136  tccaatttccctgtggatc  3689   3708      5
SEQ ID NO: 2799  cagatgattggagaggtca  2549   2568   SEQ ID NO: 4137  tgaccacacaaacagtctg  5371   5390      5
SEQ ID NO: 2800  agaatgacttttttcttca  2583   2602   SEQ ID NO: 4138  tgaagtccggattcattct  11023  11042     5
SEQ ID NO: 2801  gaactccccactggagctg  2627   2646   SEQ ID NO: 4139  cagctcaaccgtacagttc  11869  11888     5
SEQ ID NO: 2802  atatcttcatctggagtca  2660   2679   SEQ ID NO: 4140  tgacttcagtgcagaatat  11974  11993     5
SEQ ID NO: 2803  gtcattgctcccggagcca  2675   2694   SEQ ID NO: 4141  tggccccgtttaccatgac  5817   5836      5
SEQ ID NO: 2804  gctgaagtttatcattcct  2881   2900   SEQ ID NO: 4142  aggaggctttaagttcagc
SEQ ID NO: 2805                       2894   2913   SEQ ID NO: 4143  gtctcttcctccatggaat  10478  10497
SEQ ID NO: 2806  ctcattgagaacaggcagt  2984   3003   SEQ ID NO: 4144  actgactgcacgctttgag  11764  11783     5
SEQ ID NO: 2807  ttgagcagtattctgtcag  3150   3169   SEQ ID NO: 4145  ctgagagaagtgtcttcaa  12407  12426     5
SEQ ID NO: 2808  accttgtccagtgaagtcc  3293   3312   SEQ ID NO:                            12792  12811     5
SEQ ID NO: 2809  ccagtgaagtccaaattcc  3300   3319   SEQ ID NO:                            9156   9175      5
SEQ ID NO: 2810  acattcagaacaagaaaat  3402   3421   SEQ ID NO: 4148  atttcctaaagctggatgt  11175  11194     5
SEQ ID NO: 2811                       3471   3490   SEQ ID NO: 4149  ataaactgcaagatttttc  13608  13627     5
SEQ ID NO: 2812  aaatcaagggtgttatttc  3474   3493   SEQ ID NO: 4150  gaaacaatgcattagattt  9753   9772   1  5
SEQ ID NO: 2813                       3617   3636   SEQ ID NO: 4151  tctcccgtgtataatgcca  11789  11808  1  5
SEQ ID NO: 2814  aagagaagattgaatttga  3630   3649   SEQ ID NO: 4152  tcaaaacctactgtctctt  10466  10485  1  5
SEQ ID NO: 2815  aaatgacttccaatttccc  3681   3700   SEQ ID NO: 4153  gggaactacaatttcattt  7021   7040   1  5
SEQ ID NO: 2816  atgacttccaatttccctg  3683   3702   SEQ ID NO: 4154  caggctgattacgagtcat  4925   4944      5
SEQ ID NO: 2817  acttccaatttccctgtgg  3686   3705   SEQ ID NO: 4155  ccacgaaaaatatggaagt  10368  10387     5
SEQ ID NO: 2818  agttgcaatgagctcatgg  3811   3830   SEQ ID NO: 4156  ccatcagttcagataaact  7997   8016      5
SEQ ID NO: 2819  tttgcaagaccacctcaat  3868   3887   SEQ ID NO: 4157  attgacctgtccattcaaa  13679  13698     5
SEQ ID NO: 2820  gaaggagttcaacctccag  3892   3911   SEQ ID NO: 4158  ctggaattgtcattccttc  11736  11755     5
SEQ ID NO: 2821                       3927   3946   SEQ ID NO: 4159  ttttaacaaaagtggaagt  6829   6848      5
SEQ ID NO: 2822  ctcttcttaaaaagcgatg  3947   3966   SEQ ID NO: 4160  catcactgccaaaggagag  8494   8513      5
SEQ ID NO: 2823  aaaagcgatggccgggtca  3956   3975   SEQ ID NO: 4161  tgactcactcattgatttt  12688  12707     5
SEQ ID NO: 2824  ttcctttgccttttggtgg  4011   4030   SEQ ID NO: 4162  ccacaaacaatgaagggaa  9264   9283      5
SEQ ID NO: 2825  caagtctgtgggattccat  4087   4106   SEQ ID NO: 4163  atgggaaaaaacaggcttg  9574   9593      5
SEQ ID NO: 2826  aagtccctacttttaccat  4125   4144   SEQ ID NO: 4164  atgggaagtataagaactt  4842   4861      5
SEQ ID NO: 2827  tgcctctcctgggtgttct  4167   4186   SEQ ID NO: 4165  agaaaaacaaacacaggca  9651   9670      5
SEQ ID NO: 2828  accagcacagaccatttca  4250   4269   SEQ ID NO: 4166  tgaagtgtagtctcctggt  5097   5116      5
SEQ ID NO:       ccagcacagaccatttcag  4251   4270   SEQ ID NO: 4167  ctgaaatacaatgctctgg  5519   5538      5
SEQ ID NO: 2830  actatcatgtgatgggtct  4375   4394   SEQ ID NO: 4168  agacacctgattttatagt  7956   7975      5
SEQ ID NO: 2831  accacagatgtctgcttca  4504   4523   SEQ ID NO: 4169  tgaaggctgactctgtggt  4290   4309      5
SEQ ID NO: 2832  ccacagatgtctgcttcag  4505   4524   SEQ ID NO: 4170  ctgagcaacaaatttgtgg  10319  10338     5
SEQ ID NO: 2833  tttggactccaaaaagaaa  4528   4547   SEQ ID NO: 4171  tttctctcatgattacaaa  5941   5960      5
SEQ ID NO: 2834  tcaaagaagtcaagattga  4560   4579   SEQ ID NO: 4172  tcaaggataacgtgtttga  12618  12637     5
SEQ ID NO: 2835  atgagaactacgagctgac  4806   4825   SEQ ID NO: 4173  gtcagatattgttgctcat  10195  10214     5
SEQ ID NO: 2836  ttaaaatctgacaccaatg  4826   4845   SEQ ID NO: 4174  cattcattgaagatgttaa  7350   7369      5
SEQ ID NO: 2837  gaagtataagaactttgcc  4846   4865   SEQ ID NO: 4175  ggcaaatttgaaggacttc  12002  12021     5
SEQ ID NO: 2838  aagtataagaactttgcca  4847   4866   SEQ ID NO: 4176  tggcaaatttgaaggactt  12001  12020     5
SEQ ID NO: 2839  ttcttcagcctgctttctg  4949   4968   SEQ ID NO: 4177  cagaatccagatacaagaa  6892   6911      5
SEQ ID NO: 2840  ctggatcactaaattccca  4965   4984   SEQ ID NO: 4178  tgggtctttccagagccag  11041  11060     5
SEQ ID NO: 2841  aaattaatagtggtgctca  5022   5041   SEQ ID NO: 4179  tgagaagccccaagaattt  6256   6275      5
SEQ ID NO: 2842  agtgcaacgaccaacttga  5081   5100   SEQ ID NO: 4180  tcaaattcctggatacact  9856   9875      5
SEQ ID NO: 2843  ctgggaagtgcttatcagg  5246   5265   SEQ ID NO: 4181  cctgaccttcacataccag  8318   8337      5
SEQ ID NO: 2844  gcaaaaacattttaaactt  5286   5305   SEQ ID NO: 4182  aagtaaaagaaaattttgc  10752  10771     5
SEQ ID NO: 2845  aaaaacattttcaacttca  5288   5307   SEQ ID NO: 4183  tgaagtaaaagaaaatttt  10750  10769     5
SEQ ID NO: 2846  tcagtcaagaaggacttaa  5310   5329   SEQ ID NO: 4184  ttaaggacttccattctga  13371  13390     5
SEQ ID NO: 2847  tcaaatgacatgatgggct  5333   5352   SEQ ID NO: 4185  agcccatcaatatcattga  6213   6232      5
SEQ ID NO: 2848  cacacaaacagtctgaaca  5375   5394   SEQ ID NO: 4186  tgtttcaactgcctttgtg  11227  11246     5
SEQ ID NO: 2849  tcttcaaaacttgacaaca  5417   5436   SEQ ID NO: 4187  tgttttcctatttccaaga  12843  12862     5
SEQ ID NO: 2850  caagttttataagcaaact  5449   5468   SEQ ID NO: 4188  agttattttgctaaacttg  14051  14070     5
SEQ ID NO: 2851  tggtaactactttaaacag  5496   5515   SEQ ID NO: 4189  ctgtttttagaggaaacca  7520   7539      5
SEQ ID NO: 2852  aacagtgacctgaaataca  5510   5529   SEQ ID NO: 4190  tgtatagcaaattcctgtt  5898   5917      5
SEQ ID NO: 2853  gggaaactacggctagaac  5552   5571   SEQ ID NO: 4191  gttccttccatgatttccc  10941  10960     5
SEQ ID NO: 2854  aacacatctatgccatctc  5628   5647   SEQ ID NO: 4192  gagacagcatcttcgtgtt  11212  11231     5
SEQ ID NO: 2855  tcagcaagctataaagcag  5660   5679   SEQ ID NO: 4193  ctgctaagaaccttactga  7788   7807      5
SEQ ID NO: 2856                       5675   5694   SEQ ID NO: 4194  cctttcaagcactgactgc  11754  11773     5
SEQ ID NO: 2857  tctggggagaacatactgg  5874   5893   SEQ ID NO: 4195  ccaggttttccacaccaga  8046   8065      5
SEQ ID NO: 2858  ttctctcatgattacaaag  5942   5961   SEQ ID NO: 4196  ctttttcaccaacggagaa  10846  10865     5
SEQ ID NO: 2859  ctgagcagacaggcacctg  6042   6061   SEQ ID NO: 4197  caggaggctttaagttcag  7607   7626      5
SEQ ID NO: 2860  caatttaacaacaatgaat  6074   6093   SEQ ID NO: 4198  attccttcctttacaattg  8090   8109      5
SEQ ID NO: 2861  tggacgaactctggctgac  6148   6167   SEQ ID NO: 4199  gtcagcccagttccttcca  10932  10951     5
SEQ ID NO: 2862  cttttactcagtgagccca  6200   6219   SEQ ID NO: 4200  tgggctaaacgtatgaaag  7835   7854      5
SEQ ID NO: 2863  tcattgatgctttagagat  6225   6244   SEQ ID NO: 4201  atcttcataagttcaatga  13182  13201     5
SEQ ID NO: 2864  aaaaccaagatgttcactc  6303   6322   SEQ ID NO: 4202  gagtgaaatgctgtttttt  8638   8657      5
SEQ ID NO: 2865  aggaatcaacaaaccatta  6365   6384   SEQ ID NO: 4203  taatgattttcaagttcct  8302   8321   1  5
SEQ ID NO: 2866  tagttgtactggaaaacgt  6384   6403   SEQ ID NO: 4204  acgttagcctctaagacta  11936  11955  1  5
SEQ ID NO: 2867  ggaaaacgtacagagaaag  6394   6413   SEQ ID NO: 4205  cttttacaattcattttcc  13022  13041  1  5
SEQ ID NO: 2868  gaaaacgtacagagaaagc  6395   6414   SEQ ID NO: 4206  gctttctcttccacatttc  10060  10079  1  5
SEQ ID NO:                            6409   6428   SEQ ID NO: 4207  attgatgttagagtgcttt  6992   7011   1  5
SEQ ID NO: 2870  aagctgaagcacatcaata  6410   6429   SEQ ID NO:                                             5
SEQ ID NO: 2871                       6414   6433   SEQ ID NO: 4209  tcaaccttaatgattttca  8295   8314   1  5
SEQ ID NO: 2872                       6422   6441   SEQ ID NO:                            1668   1687   1  5
SEQ ID NO: 2873                       6484   6503   SEQ ID NO: 4211  tgaaatcattgaaaaatta  6727   6746      5
SEQ ID NO: 2874                       6488   6507   SEQ ID NO: 4212  tgaagtagctgagaaaatc  7102   7121      5
SEQ ID NO: 2875  aattgggagagacaagttt                                                      9496   9515      5
SEQ ID NO: 2876                       6701   6720   SEQ ID NO: 4214  tattgaaaatattgatttt  6814   6833      5
SEQ ID NO: 2877  aaaattaaaaagtcttgat  6739   6758   SEQ ID NO: 4215  atcatatccgtgtaatttt  6765   6784      5
SEQ ID NO: 2878                       6816   6835   SEQ ID NO: 4216  ttaatcttcataagttcaa  13179  13198     5
SEQ ID NO: 2879                       6946   6965   SEQ ID NO: 4217  agcttggttttgccagtct  2466   2485      5
SEQ ID NO: 2880                       7029   7048   SEQ ID NO: 4218  attccttcctttacaattg  8090   8109      5
SEQ ID NO: 2881  aggttttaatggataaatt  7182   7201   SEQ ID NO:                            13155  13174     5
SEQ ID NO: 2882  cagaagctaagcaatgtcc  7241   7260   SEQ ID NO: 4220  ggacaaggcccagaatctg  12553  12572     5
SEQ ID NO: 2883                       7270   7289   SEQ ID NO: 4221  aaagaaaacctatgcctta  13163  13182     5
SEQ ID NO: 2884                       7277   7296   SEQ ID NO: 4222  atttcttaaacattccttt  9489   9508      5
SEQ ID NO: 2885                                     SEQ ID NO: 4223  taaagccattcagtctctc  12970  12989     5
SEQ ID NO: 2886                       7303   7322   SEQ ID NO: 4224  gacatgttgataaagaaat  7379   7398      5


SEQ ID NO: 


2887 


gaattatcttttaaaacat 


7334 7353gEQ |q MO: 
74*1-1 7430onn m M**v 


4225atgtatcaaatggacattc 


7685 
10739 


7704 
10758 


5 
5 


SEQ ID NO: 
SEQ ID NO: 


2888 
2889 


t t 


-til ''▼J'JbLU IU NU. 
7548 7567ggQ \q fvJO: 


^^ctWca^ttaga^caa 3 


8420 


8439 


5 


SEQ ID NO: 2890                       7699   7718   SEQ ID NO: 4228  ttgaaggacttcaggaatg  12009  12028     5
SEQ ID NO: 2891                       7958   7977   SEQ ID NO: 4229  ggactcaaggataacgtgt  12614  12633     5
SEQ ID NO: 2892                       7992   8011   SEQ ID NO: 4231                       13124  13143     5
SEQ ID NO: 2893  ttgtegTaatgaaagtaaa  8112   8131   SEQ ID NO:                            12360  12379     5
SEQ ID NO: 2894  ctgaacagtgagctgcagt  8156   8175   SEQ ID NO: 4232  actggacttctctagtcag  8809   8828      5
SEQ ID NO: 2895                       8407   8426   SEQ ID NO: 4233  gaaaaatgaagtccggatt  11017  11036     5
SEQ ID NO: 2896  attttgattttcaagcaaa  8532   8551   SEQ ID NO: 4234  tttgcaagttaaagaaaat  14023  14042     5
SEQ ID NO: 2897  ttttgattttcaagcaaat  8533   8552   SEQ ID NO: 4235  atttgatttaagtgtaaaa  9622   9641      5
SEQ ID NO: 2898                       8536   8555   SEQ ID NO: 4236  tgcaagttaaagaaaatca  14025  14044     5
SEQ ID NO: 2899                       8645   8664   SEQ ID NO: 4237  cattggtaggagacagcat  11203  11222     5
SEQ ID NO: 2900                       8646   8665   SEQ ID NO: 4238  gcattggtaggagacagca  11202  11221  1  5
SEQ ID NO: 2901                       8706   8725   SEQ ID NO: 4239  agctagagggcctcttttt  10833  10852  1  5
SEQ ID NO:       actggagcttagtaatgga  8716   8735   SEQ ID NO: 4240  tccactcacatcctccagt  1269   1308   1  5
SEQ ID NO: 2903  cttctggaaaagggtcatg  8886   8905   SEQ ID NO: 4241  catgaacccctacatgaag  13759  13778  1  5
SEQ ID NO: 2904  ggaaaagggtcatggaaat  8891   8910   SEQ ID NO: 4242  atttgaaagttcgttttcc  9282   9301   1  5
SEQ ID NO: 2905  gggcctgccccagattctc  8910   8929   SEQ ID NO: 4243  gagaacattatggaggccc  9440   9459   1  5
SEQ ID NO: 2906  ttctcagatgagggaacac  8924   8943   SEQ ID NO: 4244  gtgtcttcaaagctgagaa  12416  12435  1  5
SEQ ID NO: 2907  gatgagggaacacatgaat  8930   8949   SEQ ID NO: 4245  attccagcttccccacatc  8338   8357   1  5
SEQ ID NO: 2908  ctttggactgtccaataag  8986   9005   SEQ ID NO: 4246  cttatgggatttcctaaag  11167  11186  1  5
SEQ ID NO: 2909  gcatccacaaacaatgaag  9260   9279   SEQ ID NO: 4247  cttcatctgtcattgatgc  10227  10246  1  5
SEQ ID NO: 2910  cacaaacaatgaagggaat  9265   9284   SEQ ID NO: 4248  attccctgaagttgatgtg  11488  11507  1  5
SEQ ID NO: 2911  ccaaaatttctctgctgga  9415   9434   SEQ ID NO: 4249  tccatcacaaatcctttgg  9671   9690   1  5
SEQ ID NO: 2912                       9416   9435   SEQ ID NO: 4250  ttccatcacaaatcctttg  9670   9689   1  5
SEQ ID NO: 2913  tctgctggaaacaacgaga  9425   9444   SEQ ID NO: 4251  tctcaagagttacagcaga  13229  13248  1  5
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SEQ ID NO: 


2914 


ctgctggaaacaacgagaa 


9426 


^SsEQ ID NO 


4252ttctcaagagttacagcag 


13228 


13247 


1 


5 


SEQ ID NO: 


2915 


agaacattatggaggccca 


9441 


9460SEQ ID NO 


4253tgggoctgccccagattct 


8909 


8928 


1 


5 


SEQ ID NO: 


291 S 


agaagcaaatctggattto 


9475 


8494 SEQ ID NO 


4254gaaatcttcaatltattot 


13821 


13840 


1 


5 


SEQ ID NO: 


2917 


tttctetctatgggaaaaa 


9565 


9584 SEQ |D mo 


4255tttttgcaagtlaaagaaa 


14021 


14040 


1 


5 


SEQ ID NO: 


2918 


tcagagcatcaaatccttt 


9712 9731 SEQ ID NO 


4256 aaagaaaatcaggatctga 


14033 


14052 


1 


5 


SEQ ID NO: 


2919 


cagaaacaatgcattagat 


9751 


9770 SEQ ID NO 


4257atctatgccatctcttctg 


5633 


5652 


1 


5 


SEQ ID NO: 


2920 


tacacattaatcctgccat 


10001 10020 S EQ ID NO 


4258atggagtctttattgtgta 


14089 


14108 


1 


5 


SEQ ID NO: 


2921 


agtcagatattgttgctea 


10194 10213 S EQ ID NO 


4259tgagaactacgagctgaot 


4807 


4826 


1 


5 


SEQ ID NO: 


2922 


ggagggtagtcataacagt 


1033610355seqidNO 


4260 actggtggcaaaaccctcc 


2734 


2753 


1 


5 


SEQ ID NO: 


2923 


caaaagccgaaattccaat 


1040410423 SE Q|DNO 


4261attgaagtacctacttttg 


8366 


8385 


1 


5 


SEQ ID NO: 


2924 


aaaagccgaaattccaatt 


1040510424SEQIDNO 


4262aattgaagtacotactttt 


8365 


8384 


1 


5 


SEQ ID NO: 


2925 


ttcaagcaagaacttaatg 


1043610455 SEQ | DN |o 


4263catlatggcccttcgtgaa 


13258 


13277 


1 


5 


SEQ ID NO: 


2925 


cctcttacttttccattga 


10578 


0597 s E Q id NO 


4264tcaaaagaagcccaagagg 


12947 


12966 


1 


5 


SEQ ID NO: 


2927 


tgaggccaacacttacttg 


10663 10682 SE Q|D NO 


4265caagcatctgattgactca 


12676 


12695 


1 


5 


SEQ ID NO: 


292B 


cacttacttgaattccaag 


10672 


0S91 SEQ ID NO 


4266cttgaacacaaagtcagtg 


6008 


6027 


1 


5 


SEQ ID NO: 


2929 


gaagtaaaagaaaattttg 


10751 


10770 SEQ ID NO


4267caaaaacattttcaacttc 


5287 


5306 


1 


5 


SEQ ID NO: 


2930 


cctggaactctctccatgg 


10882 10901 SEQ ID NO


4268 ccattlacagatcttcagg 


11372 


11391 


1 


5 


SEQ ID NO: 


2931 


agctggatgtaaccaccag 


11184 11203 SEQ ID NO


4269ctggattccacatgcagct 


11855 


11874 


1 


5 


SEQ ID NO: 


2932 


aaaattccctgaagttgat


11485 11504 SEQ ID NO


4270atcataiccgtgtaatttt 


6765 


6784 


1 


5 


SEQ ID NO: 


2933 


cagatggcattgctgcttt 


11613 


11632 SEQ ID NO


4271 aaagctgagaagaaatctg 


12424 


12443 


1 


5 


SEQ ID NO: 


2934 


agatggcattgctgctttg


11614 11633 SEQ ID NO


4272 caaagctgagaagaaatct 


12423 


12442 


1 


5 


SEQ ID NO: 


2935 


tgttgaaacagtcctggat 


11842 11861 SEQ ID NO


4273 atccaagatgagatcaaca 


13103 


13122 


1 


5 


SEQ ID NO: 


2936


catattcaaaactgagttg 


12229 12248 SEQ ID NO


4274caactctctgattactatg 


13631 


13650 


1 


5 


SEQ ID NO: 


2937 


aaagatttatcaaaagaag 


12938 12957 SEQ ID NO


4275cttcaatttattcttcttt 


13826 


13845 


1 


5 


SEQ ID NO: 


2938 


attttccaactaatagaag 


13034 


13053 SEQ ID NO


4276 cttcaaagacttaaaaaat 


8014 


8033 


1 


5 


SEQ ID NO: 


2939 


aattatatccaagatgaga


13097 13116 SEQ ID NO


4277tctcttcctccatggaatt 










SEQ ID NO: 


2940 


ttcaggaagcttctcaaga 


13218 


13237 SEQ ID NO


4278 tcttcataagttcaatgaa 


13183 


13202 


1 


5 


SEQ ID NO: 


2941 


ttgagcaatttctgcacag 


13437 13456 SEQ ID NO


4279 ctgttgaaagatttatcaa


12932 


12951 


1 


5 


SEQ ID NO: 


2942 


cfgatatacatcacggagt 


13712 


13731 SEQ ID NO


4280 actcaatggtgaaattcag 


7465 


7484 


1 


5 


SEQ ID NO: 


2943 


acatcacggagttactgaa 


13719 13738 SEQ ID NO


4281 ttcagaagctaagcaatgt


7239 


7258 


1 


5 


SEQ ID NO: 


2944 


actgcctatattgataaaa 


13882 13901 SEQ ID NO


4282ttttggcaagctatacagt 


8380 


8399 


1 


5 


SEQ ID NO: 


2945 


aggatggcattttttgcaa 


14011 


14030 SEQ ID NO


4283 ttgcaagcaagtctttcct


3013 


3032 


1 


5 


SEQ ID NO: 


2946 


ttttttgcaagttaaagaa 


14020 


14039 SEQ ID NO




9566 


9585 


1 


5 


SEQ ID NO: 


2947 


tccagaactcaagtcttca 


1627 


1646 SEQ ID NO


4285tgaaatgctgttttttgga 


8641 


8660 


3 


4 


SEQ ID NO: 


2948 


agttagtgaaagaagttct 


1956 


1975 SEQ ID NO


4286 agaatctgtaccaggaact 


12564 


12583 


3 


4 


SEQ ID NO: 


2949 


atttacagctctgacaagt 


5435 
6488 


5454 SEQ ID NO


4287 acttcagagaaatacaaat 


11409 
7429 


11428 
7448 


3 
3 


4 
4 


SEQ ID NO: 
SEQ ID NO: 


2950 
2951 


gtgcccttctcggttgctg 


26 


6507 SEQ ID NO 
45 SEQ ID NO


4288lgaaaccaatgacaaaatc 
4289 cagctgagcagacaggcac 


6039 


6058 


2 


4 


SEQ ID NO: 


2952 


attcaagcacctccggaag 


253 


272 SEQ ID NO


4290cttcataagttcaatgaat 


13184 


13203 


2 


4 


SEQ ID NO: 


2953 


gactgctgattcaagaagt 


316 


335 SEQ ID NO


4291 aottcccaactctcaagtc 


13415 


13434 


2 


4 


SEQ ID NO: 


2954 


ttgctgcagccatgtccag 


483 


502 SEQ ID NO


4292 ctgggcagctgtatagcaa 


5889 


5908 


2 


4 


SEQ ID NO: 


2955 




555 


574 SEQ ID NO 


4293taagtatgatttcaattct 


10498 


10517 


2 


4 


SEQ ID NO: 


2956 


tgaagactctccaggaact 


1095 


1114 SEQ ID NO


4294 agttcaatgaatttattca


13191 


13210 


2 


4 


SEQ ID NO: 


2957 


atctctcttgccacagctg 


1210 


1229 SEQ ID NO


4295 cagcccagccatttgagat 


9237 


9256 


2 


4 


SEQ ID NO: 


2958 


tctctcttgccacagctga 


1211 


1230 SEQ ID NO


4296tcagcccagccatttgaga 


9236 


9255 


2 


4 


SEQ ID NO: 


2959 


tgaggtgtccagccccatc 


1231 


1250 SEQ ID NO


4297gatgggaaagccgccctca 


5216 


5235 


2 


4 


SEQ ID NO: 


2960 


ccagaactcaagtcttcaa 


1628 


1647 SEQ ID NO


4298ttgaaagcagaacctctgg 


5915 


5934 


2 


4 


SEQ ID NO: 


2961 


ctgaaaaagttagtgaaag 


1949 


1968 SEQ ID NO


4299 ctttctcgggaatattcag 


10631 


10650 


2 


4 


SEQ ID NO: 


2962 


tttttcccagacagtgtca 


2246 


2265 SEQ ID NO




9730 


9749 


2 


4 


SEQ ID NO: 


2963 




2247 


2266 SEQ ID NO


4301 ttgacaggcattttgaaaa 


9729 


9748 


2 


4 
SEQ ID NO:


2964 


cattcagaacaagaaaatt 


3403 


3422 SEQ ID NO


4302 aattccaattttgagaatg


10414 


10433 


2 


4 


SEQ ID NO: 


2965 


tgaagagaagattgaattt 


3628 


3647 SEQ ID NO


4303 aaatgtcagctcttgttca 


10902 


10921 


2 


4 


SEQ ID NO: 


2966 


tttgaatggaacacaggca 


3644 


3663 SEQ ID NO


4304tgccagtttgaaaaacaaa 


11815 


11834 


2 


4 


SEQ ID NO: 


2967 


ttctagattcgaatatcaa 


4407 4426 SEQ ID NO


4305ttgacatgttgataaagaa 


7377 


7396 


2 


4 


SEQ ID NO: 


2968 


gattcgaatatcaaattca 


4412 


4431 SEQ ID NO 


4306tgaagtagaccaacaaalc 


7162 


7181 


2 


4 


SEQ ID NO: 


2969 


tgcaacgaccaacttgaag 


5083 


5102 SEQ ID NO


4307 cttcaggttccatcgtgca 


11384 


11403 


2 


4 


SEQ ID NO: 


2970 


ttaagctctcaaatgacat 


5325 


5344 SEQ ID NO


4308 atgttgataaagaaattaa 


7382 


7401 


2 


4 


SEQ ID NO: 


2971 


caatttaacaacaatgaat 


6074 


6093 SEQ ID NO


4309 attcaaactgcctatattg










SEQ ID NO: 




tgaatacagccaggacttg 


6088 


6107 SEQ ID NO


4310caagagcacacggtcttca 


10687 


10706 


2 


4 


SEQ ID NO: 


2973 


catcaatattgatcaattt 


6421 


6440 SEQ ID NO


4311 aaattccctgaagttgatg


11486 


11505 


2 


4 


SEQ ID NO: 


2974 


ttgagcatgtoaaacactt 


7059 7078 SEQ ID NO


4312 aagtaagtgctaggttcaa


9381 


9400 


2 


4 


SEQ ID NO: 


2975 




7227 7246 SEQ ID NO


431 3ttctgcaoagaaatattoa 


13446 


13465 


2 


4 


SEQ ID NO: 


2976 


ttcaggctcttcagaaagc


7929 7948 SEQ ID NO


43 1 4 gcttgcfaacctctctgaa 


12312 


12331 


2 


4 


SEQ ID NO: 


2977 


fccacaaattgaacatccc 


8787 


8805 SEQ ID NO


431 5gggaco1acoaagagtgga 


12533 


12552 


2 


4 


SEQ ID NO: 


2978 


tgaataccaatgctgaact 


10167 10186 SEQ ID NO


4316 agttcaatgaatttattca


13191 


13210 


2 


4 


SEQ ID NO: 


2979 


taaactaatagatgtaatc 


12898 12917 SEQ ID NO


4317gattactatgaaaaattta 


13640 


13659 


2 


4 


SEQ ID NO: 


2980 


ttgacclgtccattcaaaa 


13680 


13699 SEQ ID NO


4318ttttaaaagaaatcttcaa 


13813 


13832 


2 


4 


SEQ ID NO: 


2981 


gggctgagtgcccttctcg 


19 


38 SEQ ID NO 


4319 cgaggccaggccgcagccc


84 


103 


1 


4 






ggctgagtgcccttctcgg 


20 


39 SEQ ID NO 


4320ccgaggccaggccgcagcc 


83 


102 


1 


4 


SEQ ID NO: 


2982 












SEQ ID NO: 


2983 


ctgagtgcccttctcggtt 


22 


41 SEQ ID NO 




11557 


11576 




4 


SEQ ID NO: 


2984 


tctcggttgctgccgctga 


33 
90 


52 SEQ ID NO
109 SEQ ID NO 


4322 tcagctgacctcatcgaga


2168 
376 


2187 
395 


1 
1 


4 

4 


SEQ ID NO: 


2985 


caggccgcagcccaggagc 




4323 gctctgcagcttcatcctg 




SEQ ID NO: 


2986 


gctggcgctgcctgcgctg 


151 


170 SEQ ID NO 


4324 cagcacagaccatttcagc 


4252 


4271 


1 


4 


SEQ ID NO: 


2987 


tgctgctggcgggcgccag 


177 


196 SEQ ID NO


4325ctggatgtaaccaccagca 


11186 


11205 


1 


4 


SEQ ID NO: 




ctggtctgtccaaaagatg 


227 


246 SEQ ID NO 


4326 catcctgaagaccagccag


388 


407 


1 


4 


SEQ ID NO: 


2989 


ctgagagttccagtggagt 


291 


310 SEQ ID NO


4327actcaccctggacattaag 


3391 


3410 


1 


4 


SEQ ID NO: 


2990 


tccagtggagtccctggga 


299 


318 SEQ ID NO


4328tcccggagccaaggctgga 


2683 


2702 


1 


4 


SEQ ID NO: 


2991 


aggttgagctggaggttac 


354 


373 SEQ ID NO 


4329ggaaccctctccctcacct 


4736 


4755 


1 


4 


SEQ ID NO: 


2992 


tgagctggaggfccccag 


358 


377 SEQ ID NO


4330 ctgggaggcatgatgctca 


9171 


9190 


1 


4 


SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 


2993 


tctgcagcttcatcctgaa 


378 
402 
472 


397 SEQ ID NO
421 SEQ ID NO 
491 SEQ ID NO 


4331 ttcaaatataatcggcaga 

4332tcttccgttctgtaatggc 

4333tgcaagaatattttgagag 


III 


3288 
5821 
6367 


1 
1 


4 
4 


2994 
2995 


gccagtgcaccctgaaaga 
ctctgaggagtttgctgca 


1 


4 


SEQ ID NO: 


2996 


aggtatgagctcaagctgg 


500 


519 SEQ ID NO


4334ccag(tfccggggaaacct 


12724 


12743 


1 


4 


SEQ ID NO: 


2997 


tcctttacccggagaaaga 


543 


562 SEQ ID NO


4335tctttttgggaagcaagga 


2227 


2246 


1 


4 


SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 


2998 


catcaagaggggcatcatt 
tcctggltcccccagagac 


583 
609 


602 SEQ ID NO
628 SEQ ID NO 
649 SEQ ID NO


4336 aatggtcaagltcctgatg 

4337 gtctctgaactcagaagga 


2285 
13996 


2304 
14015 
14099 


1 

1 


4 
4 


2999 
3000 


aagaagccaagcaagtgtt 


630 


4338 aacaaataaatggagtctt 


14080 


1 


4 


SEQ ID NO: 


3001 


aagcaagtgttgtttctgg 


638 


657 SEQ ID NO 


4339coagagccaggtcgagctt 


11050 


11069 


1 


4 


SEQ ID NO: 


3002 


tctggataccgtgtatgga 


652 


671 SEQ ID NO


4340 tccatgtcccatttacaga 


11364 








SEQ ID NO: 


3003 


ccactcactttaccgtcaa 


678 


697 SEQ ID NO


4341 ttgatttlaacaaaagtgg 


6825 


6844 


1 




SEQ ID NO: 


3004 


aggaagggcaatgtggcaa 


701 


720 SEQ ID NO


4342ttgcaagcaagtctttcct 


3013 


3032 






SEQ ID NO: 


3005 


gcaatgtggcaacagaaat 


708 


727 SEQ ID NO


4343 atttccataccccgtttgc 


3488 


3507 


1 


4 


SEQ ID NO: 


3006 


caatgtggcaacagaaata 


709 


728 SEQ ID NO 


4344tattcttcttttccaaUg 


13834 


13853 




4 


SEQ ID NO: 


3007 


tggcaacagaaatatccac 


714 


733 SEQ ID NO


4345gtggcttcccatattgcca 


1895 


1914 


1 


4 


SEQ ID NO: 


3008 


agagacctgggccagtgtg 


737 


756 SEQ ID NO


4346cacattacatttggtctct 


2938 


2957 


1 




SEQ ID NO:


3009 


tgtgatcgcttcaagccca 


752 


771 SEQ ID NO 


4349tgggaaagccgccctcaca 


5218 


5237 




4 


SEQ ID NO: 


3010 




753 


772 SEQ ID NO


4350atgggaaagccgccctcac 


5217 


5236 


1 


4 


SEQ ID NO: 


3011 


cagcccacttgctctcatc 


784 


803 SEQ ID NO


4351 gatgctgaacagtgagctg 


8152 


8171 


1 


4 


SEQ ID NO: 


3012 




794 


813 SEQ ID NO


4352 tcataacagtactgtgagc 


10345


10364 


1 


4 
SEQ ID NO: 3013 ccttgtcaactctgataag

SEQ ID NO: 3014 cttgtcaactctgatcagc 

SEQ ID NO: 3015 agccatctgcaaggagcaa 

SEQ ID NO: 3016 gccatctgcaaggagcaac 

SEQ ID NO: 3017 cttcctgcctttctcctac 

SEQ ID NO: 3018 ctttctcctacaagaataa 

SEQ ID NO: 3019 gatcaacagccgcttcttt 

SEQ ID NO: 3020 atcaacagccgcttctttg

SEQ ID NO: 3021 acagccgcttctttggtga 

SEQ ID NO: 3022 aagatgggcctcgcatttg 

SEQ ID NO: 3023 tgttttgaagactctccag

SEQ ID NO: 3024 ttgaagactctccaggaac 

SEQ ID NO: 3025 aactgaaaaaactaaccat 

SEQ ID NO: 3026 ctgaaaaaactaaccatat 

SEQ ID NO: 3027 aaaactaaccatotctgag 

SEQ ID NO: 3028 tgagcaaaatatccagaga 

SEQ ID NO: 3029 caataagctggttactgag 

SEQ ID NO: 3030 tactgagctgagaggcctc 

SEQ ID NO: 3031 gcctcagtgatgaagcagt 

SEQ ID NO: 3032 agtcacatctctcttgcca 

SEQ ID NO: 3033 atctctcttgccacagctg 

SEQ ID NO: 3034 tctctcttgccacagctga 

SEQ ID NO: 3035 tgccacagctgattgaggt 

SEQ ID NO: 3036 gccacagctgattgaggtg 

SEQ ID NO: 3037 tcactttacaagccttggt 



SEQ ID NO: 


3038 


cccttctgatagatgtggt 


SEQ ID NO: 


3039


gtcacctacctggtggccc 


SEQ ID NO: 


3040 


ccttgtatgcgctgagcca 


SEQ ID NO: 


3041 


gacaaaccctacagggacc 


SEQ ID NO: 


3042 


tgctaattacctgatggaa 


SEQ ID NO: 


3043 


tgactgcactggggatgaa 


SEQ ID NO: 


3044 


actgcactggggatgaaga 


SEQ ID NO: 


3045 


atgaagattacacctattt 


SEQ ID NO: 


3046 


accatggagcagttaactc 


SEQ ID NO: 


3047 


gcagttaactccagaactc 


SEQ ID NO: 


3048 


cagaactcaagtcttcaat 


SEQ ID NO: 


3049 


caggctctgcggaaaatgg 


SEQ ID NO: 


3050 


ccaggaggttottcttcag 


SEQ ID NO: 


3051 


ggttcttcttcagacfflc 


SEQ ID NO: 


3052 


tttccttgatgatgcltct 


SEQ ID NO: 


3053 


ggagataagcgactggctg 


SEQ ID NO: 


3054 


gctgcctatcttatgttga 


SEQ ID NO: 


3055 


actttgtggcttcccatat 


SEQ ID NO: 


3056 


gccaatatcttgaactcag 


SEQ ID NO: 


3057 


aatatcttgaactcagaag 


SEQ ID NO: 


3058


ctcagaagaattggatatc 


SEQ ID NO: 


3059 


aagaattggatatccaaga 


SEQ ID NO: 


3060 


agaattggatatccaagat 


SEQ ID NO: 


3061 


tggatatccaagatctgaa 


SEQ ID NO: 


3062 


atatccaagatctgaaaaa 



819 838 SEQ ID NO:

820 839 SEQ ID NO:

892 911 SEQ ID NO:

893 912 SEQ ID NO:
916 935 SEQ ID NO:
924 943 SEQ ID NO:

997 1016 SEQ ID NO:

998 1017 SEQ ID NO:
1002 1021 SEQ ID NO:
1031 1050 SEQ ID NO:
1090 1109 SEQ ID NO:
1094 1113 SEQ ID NO:
1110 1129 SEQ ID NO:
1112 1131 SEQ ID NO:
1117 1136 SEQ ID NO:
1132 1151 SEQ ID NO:
1162 1181 SEQ ID NO:
1174 1193 SEQ ID NO:
1188 1207 SEQ ID NO:

1204 1223 SEQ ID NO:

1210 1229 SEQ ID NO:

1211 1230 SEQ ID NO:

1218 1237 SEQ ID NO:

1219 1238 SEQ ID NO:
1248 1267 SEQ ID NO:
1332 1351 SEQ ID NO:
1349 1368 SEQ ID NO:
1440 1459 SEQ ID NO:
1480 1499 SEQ ID NO:
1516 1535 SEQ ID NO:
1546 1565 SEQ ID NO:
1548 1567 SEQ ID NO:
1560 1579 SEQ ID NO:
1610 1629 SEQ ID NO:
1618 1637 SEQ ID NO:
1629 1648 SEQ ID NO:
1703 1722 SEQ ID NO:
1738 1757 SEQ ID NO:
1744 1763 SEQ ID NO:
1759 1778 SEQ ID NO:
1781 1800 SEQ ID NO:
1796 1815 SEQ ID NO:
1890 1909 SEQ ID NO:
1910 1929 SEQ ID NO:
1913 1932 SEQ ID NO:
1924 1943 SEQ ID NO:

1929 1948 SEQ ID NO:

1930 1949 SEQ ID NO:
1935 1954 SEQ ID NO:
1938 1957 SEQ ID NO:



4353otgagtgggtttatcaagg 

4354gotgagtgggtttatcaag 

4355ttgcaatgagctcatggct 

4356gttgcaatgagctoatggc 

4357gtaggaataaatggagaag 

4358ttattgctgaatccaaaag 

4359aaagccatcactgatgatc 

4360caaagccatcactgatgat 

4361 tcacaaatcctttggctgt 

4362caaaatagaagggaatctt 

4363ctggtaactacffiaaaca 

4364gttcaatgaatttattcaa 

4365 atggcattttttgcaagtt

4366agattgatgggcagttcag 

4367ctcaaagaatgactttttt 

4368tctccagataaaaaactca 

4369ctcagatcaaagttaattg 

4370 gagggtagtcataacagta 

4371 actgttgactcaggaaggc 
4372tggccacatagcatggact 

4374tcagctgacctcatcgaga 
4375acctgcaccaaagctggca 
4376caccaaaaaccccaatggc 

4377 accagatgctgaacagtga 

4378 aocacttacagotagaggg 

4379 gggcgacctaagttgtgac 
4380tggctggtaacctaaaagg 
4381ggtcctttatgattatgtc 
4382ttoccaaaagcagtcagca 
4383ttcaggtccatgcaagtca 
4384tcttgaacacaaagtcagt 
4385aaatgaaagtaaagateat 
4386gagtaaaccaaaacttggt 
4387gagttactgaaaaagctgc 
4388attggatatccaagatctg 
4389ccatgacctccagctcctg 
4390ctgaaatacaatgctctgg 
4391 gaaaaacttggaaacaacc 
4392agaatccagatacaagaaa 
4393 cagcatgcctagtttctcc 
4394tcaatatcaaaagcccagc 
4395atatctggaacottgaagt 
4396ctgaactcagaaggatggc 

4397 cttccattctgaatatatt 

4398 gataaaagattactttgag 
4399tcttcaatttattcttctt 
4400atcttcaatttattcttct 
4401 ttcacataccagaattcca 
4402tttttaaccagtcagatat 



12453 12472 1 4 

12452 12471 1 4 

3813 3832 1 4 

3812 3831 1 4 

9461 9480 1 4 

13656 13675 1 4 

1669 1688 1 4 

1668 1687 1 4 

9675 9694 1 4 

2077 2096 1 4 

5495 5514 1 4 

13192 13211 1 4 

14014 14033 1 4 

4572 4591 1 4 

2578 2597 1 4 

12209 12228 1 4 

12273 12292 1 4

10337 10356 1 4 

12580 12599 1 4 

8866 8885 1 4 

2169 2188 1 4 

2168 2187 1 4 

13963 13982 1 4 

11248 11267 1 4 

8148 8167 1 4 

10824 10843 1 4 

3439 3458 1 4 

5586 5605 1 4 

12355 12374 1 4 

9938 9957 1 4 

10917 10936 1 4 

6007 6026 1 4 

8118 8137 1 4 

9024 9043 1 4 

13727 13746 1 4 

1933 1952 1 4 

2485 2504 1 4 

5519 5538 1 4 

4439 4458 1 4 

6893 6912 1 4 

9952 9971 1 4 

12045 12064 1 4 

10737 10756 1 4 

14000 14019 1 4 

13378 13397 1 4 

7273 7292 1 4 

13825 13844 1 4 

13824 13843 1 4 

8325 8344 1 4 
10185 10204 1 4 
SEQ ID NO:


3063 


tatccaagatctgaaaaag 


1939 


1958 SEQ ID NO:


4403 ctttttaaccagtcagata 


10184 


10203 1 


4 


SEQ ID NO: 


3064 


caagatctgaaaaagttag 


1943 


1962 SEQ ID NO:


4404ctaaattacoatggtcttg 


4973 


4992 1 
4991 1 


4 


SEQ ID NO: 


3065 


aagatctgaaaaagttagt 


1944 


1963 SEQ ID NO:


4405 actaaattcccatggfctt 


4972 


4 


SEQ ID NO: 


3066 


tgaaaaagttagtgaaaga 


1950 


1969 SEQ ID NO:


4406tctttctcgggaatattca 


10630 


10649 1 


4 


SEQ ID NO: 


3067 


tccaactgtcatggactte 


1990 


2009 SEQ ID NO:


4407gaagcacatatgaaotgga 


13945 


13964 1 


4 


SEQ ID NO: 


3068 
3069 


tcagaaaattctotcggaa 


2007 


2026 SEQ ID NO:


4408 ttcctttaacaattcctga


9501 


9520 1 


4 


SEQ ID NO: 


ttccatcacttgacccagc 


2052 


2071 SEQ ID NO:


4409 gctgacatagggaatggaa 


8441 


8460 1 


4 


SEQ ID NO: 


3070 


cccagcctcagccaaaata 


2065 


2084 SEQ ID NO:


4410 tattctatccaagattggg


7820 


7839 1 


4 


SEQ ID NO: 


3071 




2068 


2087 SEQ ID NO:


4411 ttctatccaagattgggct


7822 


7841 1 


4 


SEQ ID NO: 


3072 


atcttatatttgatccaaa


2091 


2110 SEQ ID NO:


4412tttgaaaaacaaagcagat 


11821 


11840 1 


4 


SEQ ID NO: 


3073 


tcttatatttgatccaaat 


2092 


2111 SEQ ID NO: 


4413attttttgcaagttaaaga 


14019 


14038 1 


4 


SEQ ID NO: 


3074 


cttcotaaagaaagcatgc 


2117 


2136 SEQ ID NO:


4414 gcatggcattatgatgaag


3614 


3633 1 
5713 1 


4 


SEQ ID NO: 


3075 


ctaaagaaagcatgctgaa 


2121 


2140 SEQ ID NO:


4415 ttcagggtgtggagtttag


5694 


4 


SEQ ID NO: 


3076 


taaagaaagcatgctgaaa 


2122 


2141 SEQ ID NO:


4416tttcttaaacattccttta 


9490 


9509 


4 


SEQ ID NO: 


3077 


gagattggcttggaaggaa 


2183 


2202 SEQ ID NO:


4417 ttccctccattaagttctc


11709 


11728 1 


4 


SEQ ID NO: 


3078 


ctttgagccaacattggaa 


2206 


2225 SEQ ID NO:


4418 ttccaatgaccaagaaaag


11068 


11087 1 


4 


SEQ ID NO: 


3079 


cagacagtgtcaacaaagc 


2253 2272 SEQ ID NO:


4419 gcttactggacgaactctg


6142 


6161 1 


4 


SEQ ID NO: 


3080 


cagtgtcaacaaagctttg 


2257 


2276 SEQ ID NO:


4420caaaitcctggatacactg 


9857 


9876 1 


4 


SEQ ID NO: 


3081 


agtgtcaacaaagctttgt 


2258 


2277 SEQ ID NO:


4421 acaagaatacgtctacact 


4359 
3333 


4378 


4 


SEQ ID NO: 


3082 


ctgatggtgtctotaaggt 


2298 


2317 SEQ ID NO:




3352 


4 


SEQ ID NO: 


3083 




2299 


2318 SEQ ID NO:


4423gacctgcgcaacgagatca 


8831 


8850 


4 


SEQ ID NO: 


3084 


aa^catgagca^g^atgg 


2351 


2370 SEQ ID NO:


4424ccatgatctacatttgttt 


6796 


6815 




SEQ ID NO: 


3085 


gaagctgattaaagatttg 


2395 


2414 SEQ ID NO:


4425 caaaaacattttcaacttc 


5287 


5306 


4 


SEQ ID NO: 


3086 


aaagatttgaaatccaaag 


2405 2424 SEQ ID NO:


4426 ctttaagttcagcatcttt


7614 


7633 


4 


SEQ ID NO: 


3087 


gatgggtgcccgcactctg 


2518 


2537 SEQ ID NO:


4427cagatttgaggattccatc 


7983 


8002 


4 


SEQ ID NO: 


3088 


gggatcccccagatgattg 


2540 2559 SEQ ID NO:


4428caatcacaagtcgattccc 


9083 


9102 
10401 


4 


SEQ ID NO: 


3089 


ttttcttcactaoatcttc 


2593 


2612 SEQ ID NO:


4429gaagtgtcagtggcaaaaa 


10382 


4 


SEQ ID NO: 


3090 


tcttcactacatcttcatg 


2596 


2615 SEQ ID NO:


4430catggcattatgatgaaga 


3615 


3634 


4 


SEQ ID NO: 


3091 


tacatcttcatggagaatg 


2603 


2622 SEQ ID NO:


4431 cattatggaggcccatgta 


9445 


9464 


4 


SEQ ID NO: 


3092 


ttcatggagaatgcctttg 


2609 


2628 SEQ ID NO:


4432caaaatcaactttaatgaa 


6607 


6626 
13135 
9861 


4 


SEQ ID NO: 


3093 


tcatggagaatgcctttga 


2610 2629 SEQ ID NO:
2624 2643 SEQ ID NO:


4433tcaacacaatettcaatga 
4434ctccccaggacctttcaaa 


13116 
9842 


4 
4 


SEQ ID NO: 


3094 




4435 gctccccaggacctttcaa


9841 


9860 
9859 


4 


SEQ ID NO: 
SEQ ID NO: 


3095 
3096 


ttgaactccccactggagc 
tgaactccccactggagct 


2628 


iu ^stu iu [\u: 
2645 SEQ ID NO:


4436agctccccaggacctttca 


9840 


4 


SEQ ID NO: 


3097 


cactggagctggattacag


2635 


2654 SEQ ID NO:


4437ctgtttctgagtcccagtg 


9344 


9363 


4 


SEQ ID NO: 


3098 


actggagctggattacagt


2636 


2655 SEQ ID NO:


4438actgtttctgagtcccagt 


9343 


9362 


4 


SEQ ID NO: 


3099 


agttgcaaatatcttcatc 


2652 2671 SEQ ID NO: 


4439gatgatgccaaaatcaact 


6599 


6618 


4 


SEQ ID NO: 


3100 


gttgcaaatatcttcatct 


2653 


2672 SEQ ID NO:


4440 agatgatgccaaaatcaac 


6598 


6617 


4 


SEQ ID NO: 


3101 


aaatatcttcatctggagt 


2658 


2677 SEQ ID NO:


4441 actcagaaggatggcattt 


14004 


14023 


1 4 


SEQ ID NO: 


3102 


taaaactggaagtagccaa 


2703 


2722 SEQ ID NO:








1 4 


SEQ ID NO: 


3103 


ggctgaactggtggcaaaa 


2728 2747 SEQ ID NO:


4443tmcttttcagcccagco 


9228 


9247 


1 4 


SEQ ID NO: 


3104 


tgtggagtttgtgacaaat 


2758 


2777 SEQ ID NO:


4444attttcaagcaaatgcaca 


8538 


8557 


1 4 


SEQ ID NO: 


3105 


ttgtgacaaatatgggcat 


2766 


2785 SEQ ID NO:


4445 atgcgtctaccttacacaa 


9521 


9540 


1 4 


SEQ ID NO: 


3106 


atgaacaccaaottcttcc 


2819 2838 SEQ ID NO:


4446ggaagctgaagtttatcat 


2877 


2896 




SEQ ID NO: 


3107 


cttccacgagtcgggtctg 


2833 


2852 SEQ ID NO:


4447 cagagctatcactgggaag 


5235 


5254 


1 4 


SEQ ID NO: 


3108 


gagtcgggtctggaggctc 


2840 


2859 SEQ ID NO:


4448gagcttactggacgaactc 


6140 


6159 


1 4 


SEQ ID NO: 


3109 


cctaaaagctgggaagctg 


2866 


2885 SEQ ID NO:


4449 cagcctccccagccgtagg 


12120 


12139 


1 4 


SEQ ID NO: 


3110 


agctgggaagctgaagttt


2872 


2891 SEQ ID NO: 


4450 aaactgttaatttacagct 


5463 


5482 


1 4 


SEQ ID NO: 


3111 


ccagattagagctggaact 


3114 3133 SEQ ID NO:


4451 agtttccggggaaacctgg 


12726 


12745 


1 4 


SEQ ID NO: 


3112 


ggataccctgaagtttgta 


3208 


3227 SEQ ID NO:


4452tacagtattctgaaaatcc 


8393 


8412 
SEQ ID NO:


3113 


clgaggctaccatgacatt 


3252 3271 SEQ ID NO:


4453 aatgagctcatggcttcag


3817 


3836 


1 4 


SEQ ID NO: 


3114 


tgtccagtgaagtcoaaat 


3297 3316 SEQ ID NO


4454 attttgagaggaatcgaca 


6357 


6376 


1 4 


SEQ ID NO: 


3115 


aattccggattttgatgtt


3313 3332 SEQ ID NO


4455aacacatgaatcacaaatt 


8938 


8957 


1 4 


SEQ ID NO: 


3116 


ttccggattttgatgttga 


3315 3334 SEQ ID NO:


4456 tcaaaacgagcttcaggaa


13207 


13226 


1 4 


SEQ ID NO: 


3117 


cggaacaatcotcagagtt 


3337 3356 SEQ ID NO:


4457 aacttgtacaactggtccg


4211 


4230 


1 4 


SEQ ID NO: 


3118 


tcclcagagttaatgatga 


3345 3364 SEQ ID NO


4458 tcatcaattggttacagga


7593 


7612 


1 4 


SEQ ID NO: 


3119 


ctcaccctggacattcaga 


3392 3411 SEQ ID NO:


4459 tctgcagaacaatgctgag


12439 


12458 


1 4 


SEQ ID NO: 


3120 


cattcagaacaagaaaatt 


3403 3422 SEQ ID NO


4460aattgactttgtagaaatg 


8104 


8123 


1 4 


SEQ ID NO: 


3121 


actgaggtcgccctcatgg 


3422 3441 SEQ ID NO:


4461 ccatgcaagtcagcccagt


10924 


10943 


1 4 


SEQ ID NO: 


3122 


ttatttccataccccgttt 


3486 3505 SEQ ID NO


4462aaactgcctatattgataa 


13880 


13899 


1 4 


SEQ ID NO: 


3123 


gtttgcaagcagaagccag 


3501 3520 SEQ ID NO


4463ctggacttctcttcaaaac 


5408 


5427 


1 4 


SEQ ID NO: 


3124 


tttgcaagcagaagccaga 


3502 3521 SEQ ID NO


4464tctgggtgtcgacagcaaa 


5272 


5291 


1 4 


SEQ ID NO: 


3125 


ttgcaagcagaagccagaa


3503 3522 SEQ ID NO


4465ttctgggtgtcgacagcaa 


5271 


5290 


1 4 


SEQ ID NO:


3126 


ctgcttctccaaatggact 


3554 3573 SEQ ID NO


4466 agtcaagattgatgggcag


4567 


4586 


1 4 


SEQ ID NO: 


3127 


tgctacagcttatggctcc 


3577 3596 SEQ ID NO


4467ggaggctttaagttcagca 


7609 


7628 


1 4 


SEQ ID NO: 


3128 


acagcttatggctccacag 


3581 3600 SEQ ID NO


4468ctgtatagcaaattcctgt 


5897 


5916 


1 4 


SEQ ID NO: 


3129 


tttccaagagggtggcatg 


3600 3619 SEQ ID NO


4469catggacttcttctggaaa 


8877 


8896 


1 4 


SEQ ID NO: 


3130 


ccaagagggtggcatggca 


3603 3622 SEQ ID NO


4470tgcccagcaagcaagttgg 


9361 


9380 


4 


SEQ ID NO: 


3131 


gtggcatggcattatgatg 


3611 3630 SEQ ID NO


4471 catccttaacaccttccac 


8071 


8090 


4 


SEQ ID NO: 


3132 


tgatgaagagaagattgaa 


3625 3644 SEQ ID NO


4472ttcactgttcctgaaatca 


7871 


7890 


4 


SEQ ID NO: 


3133 


gaagagaagattgaatttg


3629 3648 SEQ ID NO


4473caaaaacattttcaacttc 


5287 


5306 


4 


SEQ ID NO: 


3134 


gagaagattgaatttgaat 


3632 3651 SEQ ID NO


4474 atteaiaatcccaactctc 


8278 


8297 


4 


SEQ ID NO: 


3135 




3644 3663 SEQ ID NO


4475tgcctttgtgtacaccaaa 


11236 


11255 


4 


SEQ ID NO: 


3136 


aggcaccaatgtagataco 


3658 3677 SEQ ID NO


4476 ggtaacctaaaaggagcct 


5591 


5610 


4 


SEQ ID NO: 


3137 


caaaaaaatgacttccaat 


3676 3695 SEQ ID NO


4477 attgaagtacctacttttg


8366 


8385 


4 


SEQ ID NO: 


3138 


aaaaaaatgacttccaatt 


3677 3696 SEQ ID NO


4478aattgaagtacctactttt 


8365 


8384 


4 


SEQ ID NO: 


3139 


aaaaaatgacttccaattt 


3678 3697 SEQ ID NO


4479 aaatccaatctcctctttt 


8406 


8425 


4 


SEQ ID NO: 


3140 


cagagtccctcaaacagac 


3760 3779 SEQ ID NO


4480 gtctgtgggattccatctg 


4090 


4109


4 


SEQ ID NO: 


3141 


aaattaatagttgcaatga 


3803 3822 SEQ ID NO


4481 tcataagttcaatgaattt


13186 


13205 


4 


SEQ ID NO: 


3142 


ttcaacctccagaacatgg 


3899 3918 SEQ ID NO


4482ccattgaccagatgctgaa 


8142 


8161 


4 


SEQ ID NO: 


3143 


igggattgccagacttcca 


3915 3934 SEQ ID NO


4483tggaaatgggcctgcccca 


8903 


8922 


4 


SEQ ID NO: 


3144 


cagtttgaaaattgagatt 


3994 4013 SEQ ID NO


4484aatcacaactcctccactg 


9541 


9560 


4 


SEQ ID NO: 


3145 


gaaaattgagattcctttg 


4000 4019 SEQ ID NO


4485 caaaactaccacacatttc 


13694 


13713 


4 


SEQ ID NO: 


3146 


tttgccttttggtggcaaa 


4015 4034 SEQ ID NO


4486tttgagaggaatcgacaaa 


6359 


6378 


4 


SEQ ID NO: 


3147 


ctccagagatotaaagatg 


4036 4055 SEQ ID NO


4487 catcaattggttacaggag 


7594 


7613 


4 


SEQ ID NO: 


3148 


a g g gac 


4045 4064 SEQ ID NO


4488 agtcctteatgtccctaga 


10033 


10052 


4 


SEQ ID NO: 


3149 


ctgtgggattccatctgcc 


4092 4111 SEQ ID NO


4489ggcattttgaaaaaaacag 


9735 


9754 1 


4 


SEQ ID NO: 


3150 


atctgccatctcgagagtt 


4104 4123 SEQ ID NO


4490aactctcaaaccctaagat 


8556 


8575 1 


4 


SEQ ID NO: 




tctcgagagttccaagtcc 


4112 4131 SEQ ID NO


4491 ggacattcctctagcgaga 


8215 


8234 1 


4 


SEQ ID NO: 


3152 


agtccctacttttaccatt 


4126 4145 SEQ ID NO


4492aatgaatacagccaggact 






4 


SEQ ID NO: 


3153 


acttttaccattccoaagt 


4133 4152 SEQ ID NO


4493 actttgtagaaatgaaagt


8109 


8128 ■ 


4 


SEQ ID NO: 


3154 


cattcccaagttgtatcaa 


4141 4160 SEQ ID NO


4494ttgaaggacftcaggaatg 


12009 


12028 1 




SEQ ID NO: 


3155 


accacatgaaggctgactc 


4284 4303 SEQ ID NO:


4495gagtaaaccaaaacttggt 


9024 


9043 1 




SEQ ID NO: 


3156 


tttcctacaatgtgcaagg 


4317 4336 SEQ ID NO


4496 cctttaacaattcctgaaa


9503 


9522 1 




SEQ ID NO: 


3157 


ctggagaaacaacatatga 


4338 4357 SEQ ID NO:


4497tcattc1gggtctttccag 


11035 


11054 1 




SEQ ID NO: 


3158 


atcatgtgatgggtctcta 


4378 4397 SEQ ID NO:


4498tagaa:tacagaaaatgat 


6565 


6584 1 


4 


SEQ ID NO: 


3159 


catgtgatgggtctctacg 


4380 4399 SEQ ID NO:


4499cgtaggcaccgtgggcatg 


12133 


12152 1 


4 


SEQ ID NO: 


3160 


ttctagattcgaatatcaa 


4407 4426 SEQ ID NO:


4500ttgatgatgctgicaagaa 


7308 


7327 1 




SEQ ID NO:


3161 


tggggaccacagatgtctg 


4499 4518 SEQ ID NO:


4501 cagaattccagcttcccca


8334 


8353 1 




SEQ ID NO: 


3162 




4644 4663 SEQ ID NO:


4502ttgaggctattgatgttag 


6984


7003 1 


4 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



3163 taacactggccggctcaat
3164 aacactggccggctcaatg 



3166 agataacaggaagatatga 
3167 tccctcacctccacctctg
3168 agctgactttaaaatctga 
3169 
3170 
3171 
3172 

3173 - - - 

3174 tggtcttgagttaaatgct 

3175 cttgagttaaatgctgaca 

3176 ttgagttaaatgctgacat 

3177 tgagttaaatgctgacatc 
3178 acttgaagtgtagtctcct

3179 agtgtagtctcctggtgct 

3180 gtgctggagaatgagctga 
3181 
3182 



3184 

3185 aaaacattttcaacttcaa
3186 cttaagctatcaaatgaca 
3187 



3189 — 

3190 actggacttctcttcaaaa

3191 acttctcttcaaaacttga 

3192 ctgacaagttttataagca 

3193 aagttttataagcaaactg 

3194 ctgttaatttacagctaca 
3195 ttacagctacagccctatt

3197 tttaaacagtgacctgaaa

3198 ttaaacagtgacctgaaat 

3199 cagtgacctgaaatacaat 

3200 tgtggctggtaacctaaaa 

3201 ttatcagcaagctataaag 



3203 attcagactcactgcattt 
3204 
3205 

3206 gctgtatagcaaattcctg 
3207 



3210 gaatacagccaggacttgg 

3211 ctggacgaactctggctgs 



4645 4664 SEQ ID NO:

4646 4665 SEQ ID NO:
4650 4669 SEQ ID NO:
4713 4732 SEQ ID NO:
4745 4764 SEQ ID NO:
4818 4837 SEQ ID NO:
4820 4839 SEQ ID NO:
4873 4892 SEQ ID NO:
4909 4928 SEQ ID NO:
4913 4932 SEQ ID NO:
4976 4995 SEQ ID NO:
4984 5003 SEQ ID NO:

4988 5007 SEQ ID NO:

4989 5008 SEQ ID NO:

4990 5009 SEQ ID NO:
5094 5113 SEQ ID NO:
5100 5119 SEQ ID NO:
5114 5133 SEQ ID NO:
5151 5170 SEQ ID NO:
5178 5197 SEQ ID NO:
5207 5226 SEQ ID NO:
5265 5284 SEQ ID NO:
5289 5308 SEQ ID NO:

5324 5343 SEQ ID NO:

5325 5344 SEQ ID NO:
5341 5360 SEQ ID NO:
5346 5365 SEQ ID NO:
5407 5426 SEQ ID NO:
5412 5431 SEQ ID NO:
5445 5464 SEQ ID NO:
5450 5469 SEQ ID NO:
5466 5485 SEQ ID NO:
5474 5493 SEQ ID NO:
5494 5513 SEQ ID NO:

5506 5525 SEQ ID NO:

5507 5526 SEQ ID NO:
5512 5531 SEQ ID NO:
5584 5603 SEQ ID NO:
5657 5676 SEQ ID NO:
5692 5711 SEQ ID NO:

5775 5794 SEQ ID NO:

5776 5795 SEQ ID NO:
5848 5867 SEQ ID NO:
5896 5915 SEQ ID NO:
6043 6062 SEQ ID NO:
6053 6072 SEQ ID NO:

6088 6107 SEQ ID NO:

6089 6108 SEQ ID NO:
6147 6166 SEQ ID NO:
6201 6220 SEQ ID NO:



4503attgaggctattgatgtta 


6983 


7002 1 


4 


4504 cattgaggctattgatgtt


6982 


7001 1 


4 


4505tctccatctgcgctaccag 




12092 1 


4 


4506tcatctc^tottoaW 


10210 
8184 


10229 1 
8203 1 


4 
4 


4507caga g 


7930 


7949 1 


4 


4509tgtcaagataaacaatcag 








4510 gaagtagtactgcatcttg


6843


6862 1 


4 


4511 ctgagtcccagtgcccagc


9350 


9369 1 


4 


4512cagcaagtacctgagaacg 


8611 


8630 1 


4 


4513actcagatcaaagttaatt 


12272 


12291 1 
10828 1 


4 
4 


4514agcacagtacgaaaaacca 
4515tgtccctagaaatotcaag 


10809 


10061 1 


4 


4516 atgtccctagaaatctcaa


10041 


10060 1


4 


4517gatgg 


4733 


4752 1 


4 


4518aggaaactcagatcaaagt 




12286 1 


4 


4519agcagccagtggcaccact 


1 2*514 






4520tcagccaggtttatagcac 


7734 


1 7753 1 


4 




13579 


13598 1 






8649 




I 4 


4523 ctttgacaggcattttgaa 


9727 


9746 1 
5857 ' 


I 4 
I 4 


4524tcgatgcacalacaaatgg 
4525ttgatgttagagtgctttt 


5838 
6993 


7012 ' 






7255 


7274 ■ 


1 4 


4527 atgtcctacaacaagttaa 


7254 


7273 ' 


1 4 


4528 agcatctttggctcacatg 


7624 


7643 ' 


1 4 


4529 atttatcaaaagaagcoca 


12942 


12961 


1 4 


4530ttttggcaagctatacagt 


8380 


8399 ' 


1 4 


4531 tcaattgggagagacaagt 


6504 


6523 


1 4 


4532tgctttgtgagtttatcag 






1 4 


4533cagtcatgtagaaaaaott 


4429 


4448 


1 4 


4534tgtactggaaaacgtacag 


6388 


6407 


1 4 


45 ^ a fjj att9a,caattt9taa 


6425 


6444 


1 4 


g g 


11820 




1 4 


4537tttcatttgaaagaataaa 
4538atttcaagcaagaacttaa 


7032 
10434 


7051 
10453 
6150 


1 4 
1 4 


4539attggcgtggagcttactg 
4540ttttgctggagaagccaca 


6131 
10765 


10784 


1 4 


4541ctttgcactatgttcataa 


12764 


12783 


1 4 






9033 


1 4 


4543aaatgctgacatagggaat 


8437 


8456 


1 4 


4544gaaa1attatgaacttgaa 


13312 


13331 


1 4 


4545tttcctaaagctggatgta 


11176 


11195 


1 4 


4546caggtccatgcaagtcagc 


10919 


10938 


1 4 


4547 ccagcttccccacatctca 


8341 


8360 


1 4 


4548tcttcgtgtttcaactgcc 


11221 
9380 


11240 
9399 


1 4 
1 4 


4549 caagtaagtgctaggttca 

4550 ccaacacttacttgaattc 

4551 tcagaaagctaccttccag 
4552atggacttcttctggaaaa 


10668 
7939 
8878 


10687 
7958 
8897 


1 4 

1 4 
1 4 
SEQ ID NO:


3213 


gatgagagatgccgttgag 


6241 6260 SEQ ID NO 


4553ctcatctcctttcttcatc 


10209 


10228 


1 4 


SEQ ID NO: 


3214 


aattgttgcttttgtaaag 


6277 6296 SEQ ID NO 


4554cttttctaaacttgaaatt 


9064 


9083 


1 4 


SEQ ID NO: 


3215 


cttttgtaaagtatgataa 


6285 6304 SEQ ID NO 


4555ttatgaacttgaagaaaag 


13318 


13337 


1 4 


SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 


3216 


tttgtaaagtatgataaaa 
tccattaacctcccatttt 


6287 6306 SEQ ID NO 

6320 6339 SEQ ID NO 

6321 6340 SEQ ID NO 


4556ttttcacattagatgcaaa 


8421 


8440 


1 4 


3217 
3218 


ccattaacctcccattttt 


4557aaaattgatgatatctgga 
4558 aaaagggtcatggaaatgg 


10727 
8893 


10746 
8912 


1 4 

1 4 


SEQ ID NO: 


3219 


cttgcaagaatattttgag 


6346 6365 SEQ ID NO 


4559ctcaattttgattttcaag 


8528 


8547 


1 4 


SEQ ID NO: 


3220 


agaatattttgagaggaat 


6352 6371 SEQ ID NO 


4560attccctccattaagttct 


11708 


11727 


1 4 


SEQ ID NO: 


3221 


attatagttgtactggaaa 


6380 6399 SEQ ID NO 


4561 tttcaagcaagaacttaat 


10435 


10454 


1 4 


SEQ ID NO: 


3222 


gaagcacatcaatattgat 


6415 6434 SEQ ID NO 


4562atcagttcagataaacttc 


7999 


8018 


1 4 


SEQ ID NO: 


3223 


acatcaatattgatcaatt 


6420 6439 SEQ ID NO 


4563aattccctgaagttgatgt 


11487 


11506 


1 4 


SEQ ID NO: 


3224 


gaaaactcccacagcaagc 


6465 6484 SEQ ID NO 


4564gctttctcttccacatttc 


10060 


10079 


4 


SEQ ID NO: 


3225 


ctgaattcattcaattggg 


6494 6513 SEQ ID NO 


4565cccatttacagatcttcag 


11371 


11390 


4 


SEQ ID NO: 


3226 


tgaattcattcaattggga 


6495 6514 SEQ ID NO 


4566tcccatttacagatcttca 


11370 


11389 


4 


SEQ ID NO: 


3227 


aactgactgctctcacaaa 


6540 6559 SEQ ID NO 


4567tttgaggattccatcagtt 


7987 


8006 


4 


SEQ ID NO: 


3228 


aaaagtatagaattacaga 


6558 6577 SEQ ID NO 


4568 tctggctccctcaactttt 


9050 


9069 


4 


SEQ ID NO: 


3229 


atcaactttaatgaaaaac 


6611 6630 SEQ ID NO 


4569gtttattgaaaatattgat 


6811 


6830 


4 


SEQ ID NO: 


3230 


tgatttgaaaatagctatt 


6694 6713 SEQ ID NO 


4570aatattattgatgaaatca 


6716 


6735 


4 


SEQ ID NO: 


3231 


atttgaaaatagctattgc 


6696 6715 SEQ ID NO 


4571 gcaagaacttaatggaaat 


10441 


10460 


4 


SEQ ID NO: 


3232 


attgctaatattattgatg 


6710 6729 SEQ ID NO 


4572catcacactgaataccaat 


10159 


10178 


4 


SEQ ID NO: 


3233 


gaaaaattaaaaagtcttg 


6737 6756 SEQ ID NO 


4573caagagcttatgggatttc 


11161 


11180 


4 


SEQ ID NO: 


3234 


actatcatatccgtgtaat 


6762 6781 SEQ ID NO 


4574attactttgagaaattagt 


7281 


7300 


4 


SEQ ID NO: 


3235 


tattgattttaacaaaagt 


6823 6842 SEQ ID NO 


4575acttgacttcagagaaata 


11404 


11423 


4 


SEQ ID NO: 


3236 


ctgcagcagcttaagagac 


6914 6933 SEQ ID NO 


4576gtcttcagtgaagctgcag 


10699 


10718 


4 


SEQ ID NO: 


3237 


aaaacaacacattgaggct 


6973 6992 SEQ ID NO 


4577agcctcacctcttactttt 


10571 


10590 


4 


SEQ ID NO: 


3238 


ttgagcatgtcaaacactt 


7059 7078 SEQ ID NO 


4578aagtagctgagaaaatcaa 


7104 


7123 


4 


SEQ ID NO: 


3239 


tttgaagtagctgagaaaa 


7100 7119 SEQ ID NO 


4579ttttcacattagatgcaaa 


8421 


8440 


4 


SEQ ID NO: 


3240 


ttagtagagttggcccacc 


7199 7218 SEQ ID NO 


4580ggtggactcttgctgctaa 


7776 


7795 


4 


SEQ ID NO: 


3241 


tgaaggagactattcagaa 


7227 7246 SEQ ID NO 


4581ttctcaattttgattttca 


8526 


8545 


4 


SEQ ID NO: 


3242 


gagactattcagaagctaa 


7232 7251 SEQ ID NO 


4582ttagccacagctctgtctc 


10301 


10320 


4 


SEQ ID NO: 


3243 


aattagttggatttattga 


7293 7312 SEQ ID NO 


4583tcaagaagcttaatgaatt 


7320 


7339 


4 


SEQ ID NO: 


3244 


gcttaatgaattatctttt 


7327 7346 SEQ ID NO 


4584aaaacgagcttcaggaagc 


13209 


13228 


4 


SEQ ID NO: 


3245 


ttaacaaattccttgacat 


7365 7384 SEQ ID NO 


4585atgtcctacaacaagttaa 


7254 


7273 


4 


SEQ ID NO: 


3246 


aaattaaagtcatttgatt 


7394 7413 SEQ ID NO 


4586aatcctttgacaggcattt 


9723 


9742 


4 


SEQ ID NO: 


3247 


gactcaatggtgaaattca 


7464 7483 SEQ ID NO 


4587tgaaattcaatcacaagtc 


9076 


9095 


4 


SEQ ID NO: 


3248 


gaaattcaggctctggaac 


7475 7494 SEQ ID NO 


4588gttctcaattttgattttc 


8525 


8544 


4 


SEQ ID NO: 


3249 


actaccacaaaaagctgaa 


7492 7511 SEQ ID NO 


4589ttcaggaactattgctagt 


10645 


10664 


4 


SEQ ID NO: 


3250 


ccaaaataaccttaatcat 


7578 7597 SEQ ID NO 


4590 atgatttccctgaccttgg 


10950 


10969 


4 


SEQ ID NO: 


3251 


aaataaccttaatcatcaa 


7581 7600 SEQ ID NO 


4591 ttgaagtaaaagaaaattt 


10749 


10768 


4 


SEQ ID NO: 


3252 


tttaagttcagcatctttg 


7615 7634 SEQ ID NO 


4592caaatctggatttcttaaa 


9480 


9499 


4 


SEQ ID NO: 


3253 


caggtttatagcacacttg 


7739 7758 SEQ ID NO 


4593 caagggttcactgttcctg 


7865 


7884 1 


4 


SEQ ID NO: 


3254 


gttcactgttcctgaaatc 


7870 7889 SEQ ID NO 




8922 


8941 1 




SEQ ID NO: 


3255 


cactgttcctgaaatcaag 


7873 7892 SEQ ID NO 


4595cttgaacacaaagtcagtg 


6008 


6027 1 




SEQ ID NO: 


3256 


actgttcctgaaatcaaga 


7874 7893 SEQ ID NO 


4596tcttgaacacaaagtcagt 


6007 


6026 1 




SEQ ID NO: 


3257 


gcctgcctttgaagtcagt 


7909 7928 SEQ ID NO 


4597actgttgactcaggaaggc 


12580 


12599 1 


4 


SEQ ID NO: 


3258 




7980 7999 SEQ ID NO 


4598 ggaagcttctcaagagtta 


13222 


13241 1 




SEQ ID NO: 


3259 




8050 8069 SEQ ID NO 


4599 aaatttctctgctggaaac 


9418 


9437 1 


4 


SEQ ID NO: 


3260 


tcagaaccattgaccagat 


8136 8155 SEQ ID NO 


4600atctgcagaacaatgctga 


12438 


12457 1 


4 


SEQ ID NO: 


3261 


tagcgagaatcaccctgcc 


8226 8245 SEQ ID NO 


4601 ggcagcttctggcttgcta 


12301 


12320 1 


4 


SEQ ID NO: 


3262 


ccttaatgattttcaagtt 


8299 8318 SEQ ID NO 


4602 aactgttgactcaggaagg 


12579 


12598 1 


4 
SEQ ID NO: 3263 acataccagaattccagct 

SEQ ID NO: 3264 aatgctgacatagggaatg 

SEQ ID NO: 3265 atgctgacatagggaatgg 

SEQ ID NO: 3266 aaccacctcagcaaacgaa 

SEQ ID NO: 3267 agcaggtatcgcagcttcc 

SEQ ID NO: 3268 tgcacaactctcaaaccct 

SEQ ID NO: 3269 aggagtcagtgaagttctc 

SEQ ID NO: 3270 tttttggaaatgccattga 

SEQ ID NO: 3271 aatggagtgattgtcaaga 

SEQ ID NO: 3272 gtcaagataaacaatcagc 

SEQ ID NO: 3273 tccacaaattgaacatccc 

SEQ ID NO: 3274 ttgaacatccccaaactgg 

SEQ ID NO: 3275 acatccccaaactggactt 

SEQ ID NO: 3276 acttctctagtcaggctga 

SEQ ID NO: 3277 tgaatcacaaattagtttc 

SEQ ID NO: 3278 agaaggacccctcacttcc 

SEQ ID NO: 3279 ttggactgtccaataagat 

SEQ ID NO: 3280 actgtccaataagatcaat 

SEQ ID NO: 3281 ctgtccaataagatcaata 

SEQ ID NO: 3282 gtttatgaatctggctccc 

SEQ ID NO: 3283 atgaatctggctccctcaa 

SEQ ID NO: 3284 ctcaacttttctaaacttg 

SEQ ID NO: 3285 ctaaaggcatggcactgtt 

SEQ ID NO: 3286 aaggcatggcactgtttgg 

SEQ ID NO: 3287 atccacaaacaatgaaggg 

SEQ ID NO: 3288 ggaatttgaaagttcgttt 

SEQ ID NO: 3289 aataactatgcactgtttc 

SEQ ID NO: 3290 gaaacaacgagaacattat 

SEQ ID NO: 3291 ttcttgaaaacgacaaagc 

SEQ ID NO: 3292 ataagaaaaacaaacacag 

SEQ ID NO: 3293 aaaacaaacacaggcattc 

SEQ ID NO: 3294 gcattccatcacaaatcct 

SEQ ID NO: 3295 tttgaaaaaaacagaaaca 

SEQ ID NO: 3296 caatgcattagattttgtc 

SEQ ID NO: 3297 caaagctgaaaaatatcag 

SEQ ID NO: 3298 cctggatacactgttccag 

SEQ ID NO: 3299 gttgaagtgtctccattca 

SEQ ID NO: 3300 tttctccatcctaggttct 

SEQ ID NO: 3301 ttctccatcctaggttctg 

SEQ ID NO: 3302 tcattagagctgccagtcc 

SEQ ID NO: 3303 tgctgaactttttaaccag 

SEQ ID NO: 3304 ctcctttcttcatcttcat 

SEQ ID NO: 3305 tgtaattgatgcactgcag 

SEQ ID NO: 3306 tgatgcactgcagtacaaa 

SEQ ID NO: 3307 agctctgtctctgagcaac 

SEQ ID NO: 3308 agccgaaattccaattttg 

SEQ ID NO: 3309 ttgagaatgaatttcaagc 

SEQ ID NO: 3310 aaacctactgtctcttcct 

SEQ ID NO: 3311 tacttttccattgagtcat 

SEQ ID NO: 3312 tcaggtccatgcaagtcag 



8328 8347 SEQ ID NO: 


IfiM 39 ^ 91 ^ 8 ? ccatcatt 


10026 


10045 1 


4 


8438 8457 SEQ ID NO: 




10005 


10024 1 


4 


8439 8458 SEQ ID NO: 


4605cca gagacacggca 




9264 1 


4 


8458 8477 SEQ ID NO: 


4606ttcgttttccattaaggtt 


9291 


4 


8476 8495 SEQ ID NO: 


4607ggaagtggccctgaatgct 


10972 


10991 1 


4 


8551 8570 SEQ ID NO: 


4608 agggaaagagaagattgca 


13501 


13520 1 
13807 1 




8592 8611 SEQ ID NO: 


4609gagaacttactatcatcct 


13788 


4 


8652 8671 SEQ ID NO: 


4610 tcaatgaatttattcaaaa 




13213 1 


4 


8729 8748 SEQ ID NO: 


Iri ^ tct | Hca ^ :cca9C 'j att 


9231 


9250 1 


4 


8741 8760 SEQ ID NO: 






4838 1 


4 


8787 8806 SEQ ID NO: 


4613gggatttcctaaagctgga 


11172 


11191 1 


4 


8795 8814 SEQ ID NO: 


4614 ccagtttccagggactcaa 




12622 1 
9109 1 


4 


8799 8818 SEQ ID NO: 


4615 aagtcgattcccagcatgt 


9090 


4 


8814 8833 SEQ ID NO: 


,„,, 0398 ? 93 , 


11010 


11029 1 


4 


8944 8963 SEQ ID NO: 


4617gaaagtccataatggttca 




12836 1 


4 


8968 8987 SEQ ID NO: 


4618ggaagaagaggcagcttct 


12292 


12311 1 


4 


8988 9007 SEQ ID NO: 


4619 atctaaatgcagtagccaa 


11634 




4 


8992 9011 SEQ ID NO: 


4620 attgataaaaccatacagt 


13891 


13910 • 


4 


8993 9012 SEQ ID NO: 


4621 tattgataaaaccatacag 


13890 


13909 1 
12274 


4 


9041 9060 SEQ ID NO: 


4622 gggaatctgatgaggaaac 




4 


9045 9064 SEQ ID NO: 


4623 ttgagttgcccaccataat 


11667 


11686 
1 9768 


4 


9059 9078 SEQ ID NO: 


4624caagatcgcagactttgag 


1 9749 


4 


9129 9148 SEQ ID NO: 


4625aacagaaacaatgcattag 








9132 9151 SEQ ID NO: 


4626ccaagaaaaggcacacctt 


11077 


11096 


4 


9262 9281 SEQ ID NO: 


4627ccctaacagatttgaggat 


7977 


7996 


4 


9279 9298 SEQ ID NO: 


4628 aaacaaacacaggcattcc 


9655 


9674 


4 


9332 9351 SEQ ID NO: 


4630 ataaactgcaagatttttc 


12836 


12855 


4 


9432 9451 SEQ ID NO: 




13608 




9599 9618 SEQ ID NO: 


4631 gctttccaatgaccaagaa 


11065 


11084 


4 


9648 9667 SEQ ID NO: 


4632ctgtgctttgtgagtttat 






4 


9654 9673 SEQ ID NO: 


4633gaatttgaaagttcgtttt 


9280 


9299 


4 


9667 9686 SEQ ID NO: 


4634aggaagtggccctgaatgc 


10971 


10990 


1 4 


9740 9759 SEQ ID NO: 


4635tgttgaaagatttatcaaa 


12933 


12952 


1 4 


9757 9776 SEQ ID NO: 




10279 


10298 


1 4 


9817 9836 SEQ ID NO: 


4637^?gaTcteatcatttg a 


11438 


11457 


1 4 


9863 9882 SEQ ID NO: 


4638 ctggacttctctagtcagg 


8810 


8829 
9065 


1 4 


9890 9909 SEQ ID NO: 


4 Rdn t9aat t t99Ct< t° CtCaa aaa 


9046 


1 4 


9964 9983 SEQ ID NO: 




6893 


6912 


1 4 


9965 9984 SEQ ID NO: 


4641 cagaatccagatccaagaa 


6892 


6911 


1 4 


10019 10038 SEQ ID NO: 


4642ggacagtgaaatattatga 


13305 


13324 


1 4 


10177 10196 SEQ ID NO: 


4643 ctggatgtaaccaccagca 


11186 


11205 


1 4 


10214 10233 SEQ ID NO: 


4644 atgaagcttgctccaggag 


13772 


13791 


1 4 


10234 10253 SEQ ID NO: 


4645 ctgcgctaccagaaagaca 


12080 


12099 


1 4 


10240 10259 SEQ ID NO: 


4646 tttgagttgcccaccatca 


11666 


11685 


1 4 


10309 10328 SEQ ID NO: 


4647 gttgaccacaagcttagct 


10547 


10566 


1 4 


10408 10427 SEQ ID NO: 


4648 caaagctggcaccagggct 


13971 


13990 
13235 


1 4 


10424 10443 SEQ ID NO: 


4649 gcttcaggaagcttctcaa 


13216 


1 4 


10469 10488 SEQ ID NO: 


4650aggaaggccaagccagttt 


12591 


12610 


1 4 


10583 10602 SEQ ID NO: 


4651 atgattatgtcaacaagta 


12363 


12382 


1 4 


10918 10937 SEQ ID NO: 




5001 


5020 


1 4 
SEQ ID NO: 


3313 


atgcaagtcagcccagttc 


10926 10945 SEQ ID NO: 


4653gaactcagaaggatggcat 


14002 


14021 


1 


4 


SEQ ID NO: 


3314 


tgaatgctaacactaagaa 


10983 11002 SEQ ID NO: 


4654ttctcaattttgattttca 


8526 


8545 


1 


4 


SEQ ID NO: 


3315 
3316 


agaagatcagatggaaaaa 


11004 11023 SEQ ID NO: 


4655ttttctaaatggaacttct 


12173 


12192 


1 


4 


SEQ ID NO: 


ggctattcattctccatcc 


11264 11283 SEQ ID NO: 




11632 


11651 


1 


4 


SEQ ID NO: 


3317 


aaagttttggctgataaat 


11288 11307 SEQ ID NO: 


4657atttcttaaacattccttt 


9489 


9508 


1 


4 


SEQ ID NO: 


3318 


agttttggctgataaattc 


11290 11309 SEQ ID NO: 


4658gaatctggctccctcaact 


9047 


9066 
11054 


1 


4 


SEQ ID NO: 


3319 


ctgggctgaaactaaatga 


11316 11335 SEQ ID NO: 


4659tcattctgggtctttccag 


11035 


1 


4 


SEQ ID NO: 


3320 


cagagaaatacaaatctat 


11413 11432 SEQ ID NO: 


4660atagcatggacttcttctg 


8873 


8892 
12325 


1 


4 


SEQ ID NO: 


3321 


gaggtaaaattccctgaag 


11480 11499 SEQ ID NO: 


4662cttctggcttgctaacctc 


12306 


1 


4 


SEQ ID NO: 


3322 


cttttttgagataaccgtg 


11545 11564 SEQ ID NO: 


4663cacggagttactgaaaaag 


13723 


13742 


1 


4 


SEQ ID NO: 


3323 


gctggaattgtcattcctt 


11735 11754 SEQ ID NO: 


4664aaggcatctccacctcagc 


12102 


12121 


1 


4 


SEQ ID NO: 


3324 


gtgtataatgccacttgga 


11795 11814 SEQ ID NO: 


4665tccaagatgagatcaacac 


13104 


13123 


1 


4 


SEQ ID NO: 


3325 


attccacatgcagctcaac 


11859 11878 SEQ ID NO: 


4666gttgagaagccccaagaat 


6254 


6273 


1 


4 


SEQ ID NO: 


3326 


tgaagaagatggcaaattt 


11992 12011 SEQ ID NO: 


4667 aaattctcttttcttttca 


9220 


9239 
12688 


1 


4 


SEQ ID NO: 


3327 


atcaaaagcccagcgttca 


12050 12069 SEQ ID NO: 


4668tgaaagtcaagcatctgat 


12669 


1 


4 


SEQ ID NO: 


3328 


gtgggcatggatatggatg 


12143 12162 SEQ ID NO: 


4669 catccttaacaccttccac 


8071 


8090 


1 


4 


SEQ ID NO: 


3329 


aaatggaacttctactaca 


12179 12198 SEQ ID NO: 


4670tgtaccataagccatattt 


10088 


10107 


1 


4 


SEQ ID NO: 


3330 




12219 12238 SEQ ID NO: 


4671ttgatgttagagtgctttt 


6993 


7012 


1 


4 


SEQ ID NO: 


3331 


ctgagaagaaatctgcaga 


12428 12447 SEQ ID NO: 


4672tctgcacagaaatattcag 


13447 


13466 


1 


4 


SEQ ID NO: 


3332 


acaatgctgagtgggttta 


12447 12466 SEQ ID NO: 


4673taaatggagtctttattgt 


14086 


14105 


1 


4 


SEQ ID NO: 


3333 


caatgctgagtgggtttat 


12448 12467 SEQ ID NO: 


4674ataaatggagtctttattg 


14085 


14104 


1 


4 


SEQ ID NO: 


3334 


ttaggcaaattgatgatat 


12477 12496 SEQ ID NO: 


4675 atattgtcagtgcctctaa 


13392 


13411 


1 


4 


SEQ ID NO: 


3335 


ataaactaatagatgtaat 


12897 12916 SEQ ID NO: 


4676attactatgaaaaatttat 


13641 


13660 


1 


4 


SEQ ID NO: 


3336 


ccaactaatagaagataac 


13039 13058 SEQ ID NO: 


4677gttattttgctaaacttgg 


14052 


14071 


1 


4 


SEQ ID NO: 


3337 


ttaattatatccaagatga 


13095 13114 SEQ ID NO: 


4678tcatcctctaattttttaa 


13800 


13819 


1 


4 


SEQ ID NO: 


3338 


tttaaattgttgaaagaaa 


13151 13170 SEQ ID NO: 




7032 


7051 


1 


4 


SEQ ID NO: 


3339 


aagttcaatgaatttattc 


13190 13209 SEQ ID NO: 


4680gaataccaatgctgaactt 


10168 


10187 


1 


4 


SEQ ID NO: 


3340 


ttgaagaaaagatagtcag 


13326 13345 SEQ ID NO: 


4681 ctgagagaagtgtcttcaa 


12407 


12426 


1 


4 


SEQ ID NO: 


3341 


acttccattctgaatatat 


13377 13396 SEQ ID NO: 


4682 atatctggaaccttgaagt 


10737 


10756 


1 


4 


SEQ ID NO: 


3342 


cacagaaatattcaggaat 


13451 13470 SEQ ID NO: 


4683attccctgaagttgatgtg 


11488 


11507 


1 


4 


SEQ ID NO: 


3343 


ccattgcgacgaagaaaat 


13560 13579 SEQ ID NO: 


4684atttttattcctgccatgg 


10103 


10122 


1 


4 


SEQ ID NO: 


3344 


tataaactgcaagattttt 


13607 13626 SEQ ID NO: 


4685aaaattcaaactgcctata 


13873 


13892 
6455 


1 


4 


SEQ ID NO: 


3345 


tctgattactatgaaaaat 


13637 13656 SEQ ID NO: 


4686atttgtaagaaaatacaga 


6436 


1 


4 


SEQ ID NO: 


3346 


ggagttactgaaaaagctg 


13726 13745 SEQ ID NO: 


4687cagcatgcctagtttctcc 




9971 


1 


4 


SEQ ID NO: 


3347 


tgaagcttgctccaggaga 


13773 13792 SEQ ID NO: 


4688tctcctttcttcatcttca 


10213 


10232 


1 


4 


SEQ ID NO: 


3348 
3349 


tgaactggacctgcaccaa 


13955 13974 SEQ ID NO: 


4689ttggtagagcaagggttca 




1 


4 


SEQ ID NO: 


ttgctaaacttgggggagg 


14058 14077 SEQ ID NO: 


4690 cctcctacagtggtggcaa 


4230 


4249 


1 


4 


SEQ ID NO: 


3350 


gattcgaatatcaaattca 


4412 4431 SEQ ID NO: 


4691 tgaaaacgacaaagcaatc 


9603 


9622 


3 


3 


SEQ ID NO: 


3351 


atttgtttgtcaaagaagt 


4551 4570 SEQ ID NO: 


4692acttttctaaacttgaaat 


9063 


9082 


3 


3 


SEQ ID NO: 


3352 


tctcggttgctgccgctga 


33 52 SEQ ID NO: 












SEQ ID NO: 


3353 


gctgaggagcccgcccagc 


47 66 SEQ ID NO: 


4694 gctggatgtaaccaccagc 


11185 


11204 


2 


3 


SEQ ID NO: 


3354 


ctggtctgtccaaaagatg 


227 246 SEQ ID NO: 


4695 catcagaaccattgaccag 


8134 


8153 


2 


3 


SEQ ID NO: 


3355 


ctgagagttccagtggagt 


291 310 SEQ ID NO: 


4696actcaatggtgaaattcag 


7465 


7484 


2 


3 


SEQ ID NO: 


3356 


cagtgcaccctgaaagagg 


404 423 SEQ ID NO: 


4697 cctcacttcctttggactg 


8977 


8996 
11418 


2 


3 


SEQ ID NO: 


3357 


ctctgaggagtttgctgca 


472 491 SEQ ID NO: 


4698tgcaaacttgacttcagag 


11399 
7050 


2 


3 


SEQ ID NO: 


3358 


acatcaagaggggcatcat 


582 601 SEQ ID NO: 


4699atgacgttcttgagcatgt 


7069 


2 


3 


SEQ ID NO: 


3359 


ctgatcagcagcagccagt 


830 849 SEQ ID NO: 


4700 actggacttctctagtcag 


8809 
11354 


8828 
11373 


2 


3 


SEQ ID NO: 


3360 


ggacgctaagaggaagcat 


865 884 SEQ ID NO: 


4701 atgcctacgttccatgtcc 


2 


3 


SEQ ID NO: 


3361 


agctgttttgaagactctc 


1087 1106 SEQ ID NO: 


4702 gagaagtgtcttcaaagct 


12411 


12430 


2 


3 


SEQ ID NO: 


3362 


tgaaaaaactaaccatctc 


1113 1132 SEQ ID NO: 


4703 gagatcaacacaatcttca 


13112 


13131 


2 


3 
SEQ ID NO: 


3363 


ctgagctgagaggcctcag 


1176 


1195 SEQ ID NO 


4704 ctgaattactgcacctcag 


3035 


3054 


2 


3 


SEQ ID NO: 


3364 


tgaaacgtgtgcatgccaa 


1311 


1330 SEQ ID NO 


4705ttggtagagcaagggttca 


7856 


7875 


2 


3 


SEQ ID NO: 


3365 


ccttgtatgcgctgagcca 


1440 


1459 SEQ ID NO 


4706tggcactgtttggagaagg 


9138 


9157 


2 


3 


SEQ ID NO: 


3366 


aggagctgctggacattgc 


1500 


1519 SEQ ID NO 


4707 gcaagtcagcccagttcot 


10928 


10947 


2 


3 


SEQ ID NO: 


3367 


atttgattctgcgggtcat 


1575 


1594 SEQ ID NO 


4708atgaaaccaatgacaaaat 


7428 


7447 


2 


3 


SEQ ID NO: 


3368 


tccagaactcaagtcttca 


1627 


1646 SEQ ID NO 


4709tgaaatacaatgctctgga 


5520 


5539 


2 


3 


SEQ ID NO: 


3369 


ggttcttcttcagactttc 


1744 


1763 SEQ ID NO 


4710 gaaataccaagtcaaaacc 


10455 


10474 


2 


3 


SEQ ID NO: 


3370 


gttgatgaggagtccttca 


1810 


1829 SEQ ID NO 


4711 tgaaaaagctgcaatcaac 


13734 


13753 


2 


3 


SEQ ID NO: 


3371 


tccaagatctgaaaaagtt 


1941 


1960 SEQ ID NO 


4712 aactgcttctccaaatgga 


3552 


3571 


2 


3 


SEQ ID NO: 


3372 


agttagtgaaagaagttct 


1956 


1975 SEQ ID NO 


4713 agaattcataatcccaact 


8275 


8294 


2 


3 


SEQ ID NO: 


3373 


gaagggaatcttatatttg 


2084 


2103 SEQ ID NO 


4714 caaaacctactgtctcttc 


10467 


10486 


2 


3 


SEQ ID NO: 


3374 


ggaagctctttttgggaag 


2221 


2240 SEQ ID NO 


4715 cttcacataccagaattcc 


8324 


8343 


2 


3 


SEQ ID NO: 


3375 


tggaataatgctcagtgtt 


2374 


2393 SEQ ID NO 


4716 aacaaacacaggcattcca 


9656 


9675 


2 


3 


SEQ ID NO: 


3376 


gatttgaaatccaaagaag 


2408 


2427 SEQ ID NO 


4717 cttcatgtccctagaaatc 


10037 


10056 


2 


3 


SEQ ID NO: 


3377 


tccaaagaagtcccggaag 


2417 


2436 SEQ ID NO 


4718 cttcagcctgctttctgga 


4951 


4970 


2 


3 


SEQ ID NO: 


3378 


aggaagggctcaaagaatg 


2570 


2589 SEQ ID NO 


4719 cattagagctgccagtcct 


10020 


10039 


2 


3 


SEQ ID NO: 


3379 


agaatgacttttttcttca 


2583 


2602 SEQ ID NO 


4720tgaagatgacgacttttct 


12160 


12179 


2 


3 


SEQ ID NO: 


3380 


tttgtgacaaatatgggca 


2765 


2784 SEQ ID NO 


4721 tgccagtttgaaaaacaaa 


11815 


11834 


2 


3 


SEQ ID NO: 


3381 


ctgaggctaccatgacatt 


3252 


3271 SEQ ID NO 


4722aatgtcagctcttgttcag 


10903 


10922 


2 


3 


SEQ ID NO: 


3382 


gtagataccaaaaaaatga 


3668 


3687 SEQ ID NO 


4723tcatttgccctcaacctac 


11450 


11469 


2 


3 


SEQ ID NO: 


3383 


aaatgacttccaatttccc 


3681 


3700 SEQ ID NO 


4724gggaactgttgaaagattt 


12927 


12946 


2 


3 


SEQ ID NO: 


3384 


atgacttccaatttccctg 


3683 


3702 SEQ ID NO 


4725caggagaacttactatcat 


13785 


13804 


2 


3 


SEQ ID NO: 


3385 


atctgccatctcgagagtt 


4104 


4123 SEQ ID NO 


4726 aactcctccactgaaagat 


9547 


9566 


2 


3 


SEQ ID NO: 


3386 


atttgtttgtcaaagaagt 


4551 


4570 SEQ ID NO 


4727acttccgtttaccagaaat 


8247 


8266 


2 


3 


SEQ ID NO: 


3387 


gcagagcttggcctctctg 


5135 


5154 SEQ ID NO 


4728cagagctttctgccactgc 


13518 


13537 


2 


3 


SEQ ID NO: 


3388 


atatgctgaaatgaaattt 


5353 


5372 SEQ ID NO 


4729aaattcaaactgcctatat 


13874 


13893 


2 


3 


SEQ ID NO: 


3389 


tcaaaacttgacaacattt 


5420 


5439 SEQ ID NO 


4730 aaatacttccacaaattga 


8780 


8799 


2 


3 


SEQ ID NO: 


3390 


cagtgacctgaaatacaat 


5512 


5531 SEQ ID NO 


4731 attgaacatccccaaactg 


8794 


8813 


2 


3 


SEQ ID NO: 


3391 


tacaaatggcaatgggaaa 


5848 


5867 SEQ ID NO 


4732tttcaactgcctttgtgta 


11229 


11248 


2 


3 


SEQ ID NO: 


3392 


cttttgtaaagtatgataa 


6285 


6304 SEQ ID NO 


4733ttattgctgaatccaaaag 


13656 


13675 


2 


3 


SEQ ID NO: 


3393 


ttgtaaagtatgataaaaa 


6288 


6307 SEQ ID NO 


4734ttttcaagcaaatgcacaa 


8539 


8558 


2 


3 


SEQ ID NO: 


3394 


tccattaacctcccatttt 


6320 


6339 SEQ ID NO 


4735aaaagaaaattttgctgga 


10756 


10775 


2 


3 


SEQ ID NO: 


3395 


gattatctgaattcattca 


6488 


6507 SEQ ID NO 


4736tgaagtagaccaacaaatc 


7162 


7181 


2 


3 


SEQ ID NO: 


3396 


aattgggagagacaagttt 


6506 


6525 SEQ ID NO 


4737aaactaaatgatctaaatt 


11324 


11343 


2 


3 


SEQ ID NO: 


3397 


atttgaaaatagctattgc 


6696 


6715 SEQ ID NO 


4738 gcaatttctgcacagaaat 


13441 


13460 


2 


3 


SEQ ID NO: 


3398 


tgagcatgtcaaacacttt 




7079 SEQ ID NO 


4739 aaagccattcagtctctca 


12971 


12990 


2 


3 


SEQ ID NO: 


3399 


ttgaagatgttaacaaatt 


7356 


7375 SEQ ID NO 


4740 aattccatatgaaagtcaa 


12660 


12679 


2 


3 


SEQ ID NO: 


3400 


acttgtcacctacatttct 


7753 


7772 SEQ ID NO 


4741 agaatattttgatccaagt 


13276 


13295 


2 


3 


SEQ ID NO: 


3401 


gttttccacaccagaattt 


8050 


8069 SEQ ID NO 


4742 aaatctggatttcttaaac 


9481 


9500 


2 


3 


SEQ ID NO: 


3402 


ataagtacaaccaaaattt 




9424 SEQ ID NO 


4743aaataaatggagtctttat 


14083 


14102 


2 


3 


SEQ ID NO: 


3403 


cgggacctgcggggctgag 


8 


27 SEQ ID NO 


4744 ctcagttaactgtgtcccg 






1 


3 


SEQ ID NO: 






25 


44 SEQ ID NO 


4745 agcatctgattgactcact 


12678 


12697 






SEQ ID NO: 


3405 




47 


66 SEQ ID NO 


4746gctgattgaggtgtccagc 


1225 


1244 


1 


3 


SEQ ID NO: 


3406 


gaggagcccgcccagccag 


50 


69 SEQ ID NO 


4747ctggatcacagagtccctc 


3752 


3771 


1 


3 


SEQ ID NO: 


3407 


gggccgcgaggccgaggcc 


72 


91 SEQ ID NO 


4748 ggccctgatccccgagccc 


1363 


1382 


1 


3 


SEQ ID NO: 


3408 


ccaggccgcagcccaggag 




108 SEQ ID NO 


4749ctcccggagccaaggctgg 


2682 


2701 


1 


3 


SEQ ID NO: 


3409 


ggagccgccccaccgcagc 


104 


123 SEQ ID NO 


4750 gctgttttgaagactctcc 


1088 


1107 


1 


3 


SEQ ID NO: 


3410 


gaagaggaaatgctggaaa 


200 


219 SEQ ID NO 


4751 tttcaagttcctgaccttc 


8309 


8328 


1 


3 


SEQ ID NO: 


3411 




237 


256 SEQ ID NO 


4752aatcttattggggattttg 


7085 


7104 


1 


3 


SEQ ID NO: 


3412 




253 


272 SEQ ID NO 


4753 cttccacatttcaaggaat 


10067 


10086 


1 


3 
SEQ ID NO: 


3413 


gttccagtggagtccctgg 


297 


316 SEQ ID NO 


4754ccagcaagtacctgagaac 


8610 


8629 


1 3 


SEQ ID NO: 


3414 


gactgctgattcaagaagt 


316 


335 SEQ ID NO 


4755acttgaagaaaagatagtc 


13324 


13343 


1 3 


SEQ ID NO: 


3415 


gtgccaccaggatcaactg 


333 


352 SEQ ID NO 


4756 cagtgaagctgcagggcac 


10704 


10723 


1 3 


SEQ ID NO: 


3416 


gatcaactgcaaggttgag 


343 


362 SEQ ID NO 


4757ctcacctccacctctgatc 


4748 


4767 


1 3 


SEQ ID NO: 


3417 


actgcaaggttgagctgga 


348 


367 SEQ ID NO 


4758tccactcacatcctccagt 


1289 


1308 


1 3 


SEQ ID NO: 


3418 


ccagctctgcagcttcatc 


373 


392 SEQ ID NO 


4759gatgtggtcacctacctgg 


1343 


1362 


1 3 


SEQ ID NO: 


3419 


agcttcatcctgaagacca 


383 


402 SEQ ID NO 


4760tggtgctggagaatgagct 


5112 


5131 


1 3 


SEQ ID NO: 


3420 


cttcatcctgaagaccagc 


385 


404 SEQ ID NO 


4761 gctggagtaaaactggaag 


2696 


2715 


1 3 


SEQ ID NO: 


3421 


ccagccagtgcaccctgaa 


399 


418 SEQ ID NO 


4762ttcaagatgactgcactgg 


1539 


1558 


1 3 


SEQ ID NO: 


3422 


cagtgcaccctgaaagagg 


404 


423 SEQ ID NO 


4763 cctcacagagctatcactg 


5230 


5249 


1 3 


SEQ ID NO: 


3423 


tggcttcaaccctgagggc 


427 


446 SEQ ID NO 


4764gcccactggtcgcctgcca 


3533 


3552 


1 3 


SEQ ID NO: 


3424 


cttcaaccctgagggcaaa 


430 


449 SEQ ID NO 


4765tttgagccaacattggaag 


2207 


2226 


1 3 


SEQ ID NO: 


3425 


ttcaaccctgagggcaaag 


431 


450 SEQ ID NO 


4766ctttgacaggcattttgaa 


9727 


9746 


1 3 


SEQ ID NO: 


3426 


cttgctgaagaaaaccaag 


451 


470 SEQ ID NO 


4767cttgaaattcaatcacaag 


9074 


9093 


3 


SEQ ID NO: 


3427 


tgctgaagaaaaccaagaa 


453 


472 SEQ ID NO 


4768ttctgctgccttatcagca 


5647 


5666 


3 


SEQ ID NO: 


3428 


ttgctgcagccatgtccag 


483 


502 SEQ ID NO 


4769ctggtcagtttgcaagcaa 


3004 


3023 


3 


SEQ ID NO: 


3429 


tgctgcagccatgtccagg 


484 


503 SEQ ID NO 


4770 cctggtcagtttgcaagca 


3003 


3022 


3 


SEQ ID NO: 


3430 


agccatgtccaggtatgag 


490 


509 SEQ ID NO 


4771 ctcacatcctccagtggct 


1293 


1312 


3 


SEQ ID NO: 


3431 


agctcaagctggccattcc 


507 


526 SEQ ID NO 


4772ggaactaccacaaaaagct 


7489 


7508 


3 


SEQ ID NO: 


3432 


agaagggaagcaggttttc 


526 


545 SEQ ID NO 


4773gaaatcttcaatttattct 


13821 


13840 


3 


SEQ ID NO: 


3433 


aagggaagcaggttttcct 


528 


547 SEQ ID NO 


4774aggacaccaaaataacctt 


7572 


7591 


3 


SEQ ID NO: 


3434 


agaaagatgaacctactta 


555 


574 SEQ ID NO 


4775taagaactttgccacttct 


4852 


4871 


3 


SEQ ID NO: 


3435 


atcctgaacatcaagaggg 


575 


594 SEQ ID NO 


4776 ccctaacagatttgaggat 


7977 


7996 


3 


SEQ ID NO: 


3436 


tcctgaacatcaagagggg 


576 


595 SEQ ID NO 


4777cccctaacagatttgagga 


7976 


7995 


3 


SEQ ID NO: 


3437 


ctgaacatcaagaggggca 


578 


597 SEQ ID NO 


4778 tgcctgcctttgaagtcag 


7908 


7927 


3 


SEQ ID NO: 


3438 


aacatcaagaggggcatca 


581 


600 SEQ ID NO 


4779tgataaaaaccaagatgtt 


6298 


6317 


3 


SEQ ID NO: 


3439 


acatcaagaggggcatcat 


582 


601 SEQ ID NO 


4780atgataaaaaccaagatgt 


6297 


6316 


3 


SEQ ID NO: 


3440 


tcatttctgccctcctggt 


597 


616 SEQ ID NO 


4781 accaccagtttgtagatga 


7413 


7432 


3 


SEQ ID NO: 


3441 


ttcccccagagacagaaga 


615 


634 SEQ ID NO 


4782tcttccacatttcaaggaa 


10066 


10085 


3 


SEQ ID NO: 


3442 


gaagaagccaagcaagtgt 


629 


648 SEQ ID NO 


4783 acaccttccacattccttc 


8079 


8098 


3 


SEQ ID NO: 


3443 


ttgtttctggataccgtgt 


647 


666 SEQ ID NO 


4784 acactaaatacttccacaa 


8775 


8794 


3 


SEQ ID NO: 


3444 


tgtatggaaactgctccac 


663 


682 SEQ ID NO 


4785gtggaggcaacacattaca 


2928 


2947 


3 


SEQ ID NO: 


3445 


aaactgctccactcacttt 


670 


689 SEQ ID NO 


4786aaagaaacagcatttgttt 


4540 


4559 


3 


SEQ ID NO: 


3446 


actcactttaccgtcaaga 


680 


699 SEQ ID NO 


4787tcttacttttccattgagt 


10580 


10599 


3 


SEQ ID NO: 


3447 


ctttaccgtcaagacgagg 


685 


704 SEQ ID NO 


4788 cctccagctcctgggaaag 


2491 


2510 


3 


SEQ ID NO: 


3448 


ttaccgtcaagacgaggaa 


687 


706 SEQ ID NO 


4789ttcctaaagctggatgtaa 


11177 


11196 


3 


SEQ ID NO: 


3449 


acgaggaagggcaatgtgg 


698 


717 SEQ ID NO 


4790ccacaagtcatcatctcgt 


5964 


5983 1 


3 


SEQ ID NO: 


3450 


cgaggaagggcaatgtggc 


699 


718 SEQ ID NO 


4791 gccagaagtgagatcctcg 


3515 


3534 1 


3 


SEQ ID NO: 


3451 


gaggaagggcaatgtggca 


700 


719 SEQ ID NO 


4792tgccagtctccatgacctc 


2476 


2495 1 


3 


SEQ ID NO: 


3452 


ggaagggcaatgtggcaac 


702 


721 SEQ ID NO 


4793gttgctcttaaggacttcc 


13364 


13383 1 


3 


SEQ ID NO: 


3453 


gaagggcaatgtggcaaca 


703 


722 SEQ ID NO 


4794tgttgatgaggagtccttc 


1809 


1828 1 


3 


SEQ ID NO: 


3454 


caggcatcagcccacttgc 




796 SEQ ID NO 


4795gcaagtctttcctggcctg 


3019 


3038 1 


3 


SEQ ID NO: 


3455 


aggcatcagcccacttgct 


778 


797 SEQ ID NO 


4796agcaagtctttcctggcct 


3018 


3037 1 


3 


SEQ ID NO: 


3456 


tcagcccacttgctctcat 


783 


802 SEQ ID NO 


4797atgaaagtcaagcatctga 


12668 


12687 1 


3 


SEQ ID NO: 


3457 


gtcaactctgatcagcagc 


823 


842 SEQ ID NO 


4798 gctgactttaaaatctgac 


4819 


4838 1 


3 


SEQ ID NO: 


3458 


ggacgctaagaggaagcat 


865 


884 SEQ ID NO 


4799 atgcactgtttctgagtcc 


9339 


9358 1 


3 


SEQ ID NO: 


3459 


aaggagcaacacctcttcc 


902 


921 SEQ ID NO 


4800ggaatatcttagcatcctt 


13465 


13484 1 


3 


SEQ ID NO: 


3460 


aggagcaacacctcttcct 


903 


922 SEQ ID NO 


4801 aggaatatcttagcatcct 


13464 


13483 1 


3 


SEQ ID NO: 


3461 


caacacctcttcctgcctt 


908 


927 SEQ ID NO 


4802aaggctgactctgtggttg 


4292 


4311 1 


3 


SEQ ID NO: 


3462 


aacacctcttcctgccttt 


909 


928 SEQ ID NO 


4803aaagcaggccgaagctgtt 


1075 


1094 1 


3 
SEQ ID NO: 


3463 


acaagaataagtatgggat 


933 952 SEQ ID NO: 


4804atccatgatctacatttgt 


6794 


6813 1 
1265 1 


3 


SEQ ID NO: 


3464 


caagaataagtatgggatg 


934 953 SEQ ID NO: 


4805catcactttacaagccttg 


1246 


3 


SEQ ID NO: 


3465 


tagcacaagtgacacagac 


954 973 SEQ ID NO: 


4806gtctcttcgttctatgcta 


4592 


4611 1 


3 


SEQ ID NO: 


3466 


agcacaagtgacacagact 


955 974 SEQ ID NO: 


4807agtctcttcgttctatgct 


4591 


4610 1 


3 


SEQ ID NO: 


3467 




956 975 SEQ ID NO: 


4808 aagtgtagtctcctggtgc 


5099 
7987 


5118 1 


3 


SEQ ID NO: 


3468 


aacttgaagacacaccaaa 


978 997 SEQ ID NO: 


4809tttgaggattccatcagtt 


8006 1 


3 


SEQ ID NO: 


3469 


gcttctttggtgaaggtac 


1008 1027 SEQ ID NO: 


4810 gtacctacttttggcaagc 


8372 


8391 1 


3 


SEQ ID NO: 


3470 


ctttggtgaaggtactaag 


1012 1031 SEQ ID NO: 


4811 cttatgggatttcctaaag 


11167 


11186 1 


3 


SEQ ID NO: 


3471 


tactaagaagatgggcctc 


1024 1043 SEQ ID NO: 


4812 gagggtagtcataacagta 


10337 


10356 1 


3 


SEQ ID NO: 


3472 


tttgagagcaccaaatcca 


1046 1065 SEQ ID NO: 


4813 tggaagtgtcagtggcaaa 


10380 


10399 1 


3 


SEQ ID NO: 


3473 


agagcaccaaatccacatc 


1050 1069 SEQ ID NO: 


4814 gatggatatgaccttctct 


4876 


4895 1 


3 


SEQ ID NO: 


3474 


agctgttttgaagactctc 


1087 1106 SEQ ID NO: 


4815 gagaacatactgggcagct 


5880 


5899 1 
7131 ' 


3 


SEQ ID NO: 


3475 


tgaaaaaactaaccatctc 


1113 1132 SEQ ID NO: 


4816 gagaaaatcaatgccttca 


7112 


3 


SEQ ID NO: 


3476 


gaaaaaactaaccatctct 


1114 1133 SEQ ID NO: 


4817agagccaggtcgagctttc 


11052 


11071 1 


3 


SEQ ID NO: 


3477 


tctgagcaaaatatccaga 


1130 1149 SEQ ID NO: 


4818 tctgatgaggaaactcaga 


12260 


12279 1 


3 


SEQ ID NO: 


3478 


tctcttcaataagctggtt 


1156 1175 SEQ ID NO: 


4819 aacctcccattttttgaga 


6326 


6345 1 


3 


SEQ ID NO: 


3479 


ctgagctgagaggcctcag 


1176 1195 SEQ ID NO: 




1367 


1386 1 


3 


SEQ ID NO: 


3480 


tgaagcagtcacatctctc 


1198 1217 SEQ ID NO: 


4821 gagaaaatcaatgccttca 


7112 


7131 


3 


SEQ ID NO: 


3481 
3482 


aagcagtcacatctctctt 


1200 1219 SEQ ID NO: 


4822aagaggcagcttctggctt 


12297 


12316 


3 


SEQ ID NO: 


ctctcttgccacagctgat 


1212 1231 SEQ ID NO: 


4823 atcaaaagaagcccaagag 


12946 


12965 


3 


SEQ ID NO: 


3483 


tcttgccacagctgattga 


1215 1234 SEQ ID NO: 


4824tcaaagttaattgggaaga 


12279 


12298 


3 


SEQ ID NO: 


3484 


cttgccacagctgattgag 


1216 1235 SEQ ID NO: 


4825ctcaattttgattttcaag 


8528 


8547 


3 


SEQ ID NO: 


3485 


tgaggtgtccagccccatc 


1231 1250 SEQ ID NO:


4826gatggaaccctctccctca 


4733 


4752 


3 


SEQ ID NO: 


3486 


tcagtgtggacagcctcag 


1267 1286 SEQ ID NO:


4827ctgacatcttaggcactga 


5001 


5020 


3 


SEQ ID NO: 


3487 


acatcctccagtggctgaa 


1296 1315 SEQ ID NO:


4828ttcagaagctaagcaatgt 


7239 


7258 
12342 


3 


SEQ ID NO: 


3488 


gcacagcagctgcgagaga 


1385 1404 SEQ ID NO:


4829tctctgaaagacaacgtgc 


12323 


3 


SEQ ID NO: 


3489 


cagcagctgcgagagatct 


1388 1407 SEQ ID NO:


4830agataacattaaacagctg 


13051 


13070 


3 


SEQ ID NO: 


3490 


gcgagggatcagcgcagcc 


1415 1434 SEQ ID NO:


4831 ggctcaacacagacatcgc 


5718 


5737 


3 


SEQ ID NO: 


3491 


aagacaaaccctacaggga 


1478 1497 SEQ ID NO:


4832 tcccagaaaacctcttctt


3936 


3955 


3 


SEQ ID NO: 


3492 


caggagctgctggacattg 


1499 1518 SEQ ID NO:


4833caatggagagtccaacctg 


4660 


4679 


3 


SEQ ID NO: 


3493 


aggagctgctggacattgc 


1500 1519 SEQ ID NO:


4834gcaagggttcactgttcct 


7864 


7883 


3 


SEQ ID NO: 


3494 


ctgctggacattgctaatt 


1505 1524 SEQ ID NO:


4835 aattgggaagaagaggcag


12287 


12306 


3 


SEQ ID NO: 


3495 


gattacacctatttgattc 


1565 1584 SEQ ID NO:


4836gaatattttgagaggaatc 


6353 
7161 


6372 


3 


SEQ ID NO: 


3496 


atttgattctgcgggtcat 


1575 1594 SEQ ID NO:


4837atgaagtagaccaacaaat 


7180


3 


SEQ ID NO: 


3497 


tctgcgggtcattggaaat 


1582 1601 SEQ ID NO:


4838atttgtaagaaaatacaga 


6436 


6455 


3 


SEQ ID NO: 


3498 


aaccatggagcagttaact 


1609 1628 SEQ ID NO:


4839 agtttctccatcctaggtt 


9962 
8400 


9981 
8419 


1 3 


SEQ ID NO: 


3499 


ggagcagttaactccagaa 


1615 1634 SEQ ID NO:


4840ttctgaaaatccaatctcc 


1 3 


SEQ ID NO: 


3500 




1625 1644 SEQ ID NO:


4841 aagatcgcagactttgagt 


11654 


11673 


1 3 


SEQ ID NO: 


3501 


tccagaactcaagtcttca 


1627 1646 SEQ ID NO:


4842tgaactcagaagaattgga 


1920 


1939 


1 3 


SEQ ID NO: 


3502 


aagtacaaagccatcactg 


1663 1682 SEQ ID NO:


4843 cagtcatgtagaaaaactt 


4429 


4448 


1 3 


SEQ ID NO: 


3503 


gccatcactgatgatccag 


1672 1691 SEQ ID NO:


4844 ctggaactctctccatggc 


10883 


10902 


1 3 


SEQ ID NO: 


3504 


ccatcactgatgatccaga


1673 1692 SEQ ID NO:


4845tctgaactcagaaggatgg 








SEQ ID NO: 


3505 




1685 1704 SEQ ID NO:


4846ggatttcctaaagctggat 


11173 


11192 


1 3


SEQ ID NO: 


3506 


cagaaagctgccatccagg 


1688 1707 SEQ ID NO:


4847 cctgaaatacaatgctctg


5518 


5537 


1 3 


SEQ ID NO: 


3507 


acaaggaccaggaggttct 


1731 1750 SEQ ID NO:


4848 agaaacagcatttgtttgt


4542 


4561 


1 3 


SEQ ID NO: 


3508 


aggaccaggaggttcttct 


1734 1753 SEQ ID NO:


4849 agaagctaagcaatgtcct 


7242 


7261 


1 3 


SEQ ID NO: 


3509


accaggaggttcttcttca 


1737 1756 SEQ ID NO:


4850tgaaggctgactctgtggt 


4290 


4309 


1 3 


SEQ ID NO: 


3510 


tcttcagactttccttgat 


1750 1769 SEQ ID NO:


4851 atcaggaagggctcaaaga 


2567 


2586 


1 3 


SEQ ID NO: 


3511 


ttcagactttccttgatga


1752 1771 SEQ ID NO:


4852tcattactcctgggctgaa 


11307 


11326 


1 3 


SEQ ID NO: 


3512 


gttgatgaggagtccttca 


1810 1829 SEQ ID NO:


4853tgaatctggctccctcaac 


9046 


9065 


1 3 
SEQ ID NO:


3513 


cttcacaggcagatattaa 


1824 


1843 SEQ ID NO:


4854 ttaatcgagaggtatgaag


7148 


7167 


1 3 


SEQ ID NO: 


3514 


ttcacaggcagatattaac 


1825 


1844 SEQ ID NO


4855gttaatcgagaggtatgaa 


7147 


7166 


1 3 


SEQ ID NO: 


3515 


ggcagatattaacaaaatt


1831 


1850 SEQ ID NO


4856 aattgcattagatgatgcc


6589 


6608 


1 3 


SEQ ID NO: 


3516 


atattaacaaaattgtcca 


1836 


1855 SEQ ID NO


4857tggagtttgtgacaaatat 


2760 


2779 


1 3 


SEQ ID NO: 


3517 


acaaaattgtccaaattct 


1842 


1861 SEQ ID NO


4858 agaaacagcatttgtttgt 


4542 


4561 


1 3 


SEQ ID NO: 


3518 


gagcaagtgaagaactttg 


1877 


1896 SEQ ID NO


4859 caaatgacatgatgggctc 


5334 


5353 


1 3 


SEQ ID NO: 


3519 


gtgaagaactttgtggctt 


1883 


1902 SEQ ID NO


4860 aagcatctgattgactcac 


12677 


12696 


1 3 


SEQ ID NO: 


3520 


agaactttgtggcttccca 


1887 


1906 SEQ ID NO


4861 tgggcctgccccagattct 


8909 


8928 


1 3 


SEQ ID NO: 


3521 


tttgtggcttcccatattg 


1892 


1911 SEQ ID NO


4862caataagatcaatagcaaa 


8998 


9017 


1 3 


SEQ ID NO: 


3522 


tggcttcccatattgccaa 


1896 


1915 SEQ ID NO


4863ttggctcacatgaaggcca 


7631 


7650 


1 3


SEQ ID NO: 


3523 


ttcccatattgccaatatc 


1900 


1919 SEQ ID NO


4864gatatacactagggaggaa 


12745 


12764 


1 3 


SEQ ID NO: 


3524 


tcccatattgccaatatct 


1901 


1920 SEQ ID NO


4865 agatcaaagttaattggga 


12276 


12295 


1 3 


SEQ ID NO: 


3525 


ttgccaatatcttgaactc 


1908 


1927 SEQ ID NO 


4866 gagtcccagtgcccagcaa 


9352 


9371 


1 3 


SEQ ID NO: 


3526 


ttggatatccaagatctga 


1934 


1953 SEQ ID NO


4867tcagtataagtacaaccaa 


9400 


9419 


3 


SEQ ID NO: 


3527 


tccaagatctgaaaaagtt 


1941 


1960 SEQ ID NO


4868aacttccaactgtcatgga 


1986 


2005 


3 


SEQ ID NO: 


3528 


ctgaaaaagttagtgaaag 


1949 


1968 SEQ ID NO


4869ctttgaagtcagtcttcag 


7915 


7934 


3 


SEQ ID NO: 


3529 


agttagtgaaagaagttct 


1956 


1975 SEQ ID NO 


4870agaatctcaacttccaact 


1978 


1997 


3 


SEQ ID NO: 


3530 


aatctcaacttccaactgt 


1980 


1999 SEQ ID NO


4871 acaggggtcctttatgatt 


12350 


12369 


3 


SEQ ID NO: 


3531 


gtcatggacttcagaaaat


1997 2016 SEQ ID NO


4872atttgaaagaataaatgac 


7036 


7055 


3 


SEQ ID NO: 


3532 


tcaactctacaaatctgtt 


2029 


2048 SEQ ID NO


4873aacacattgaggctattga 


6978 


6997 


3 


SEQ ID NO: 


3533 


aactctacaaatctgtttc 


2031 


2050 SEQ ID NO


4874gaaaaaggggattgaagtt 


10284 


10303 


3 


SEQ ID NO: 


3534 


aaatagaagggaatcttat 


2079 


2098 SEQ ID NO


4875ataagcaaactgttaattt 


5457 


5476 


3 


SEQ ID NO: 


3535 


agaagggaatcttatattt 


2083 


2102 SEQ ID NO


4876aaatgcactgctgcgttct 


4900 


4919 


3 


SEQ ID NO: 


3536 


gaagggaatcttatatttg 


2084 


2103 SEQ ID NO


4877caaaaacattttcaacttc 


5287 


5306 


3 


SEQ ID NO: 


3537 


tgatccaaataactacctt 


2101 


2120 SEQ ID NO


4878aaggaagaaagaaaaatca 


3461 


3480 


3 


SEQ ID NO: 


3538 


tggatttgcttcagctgac 


2158 


2177 SEQ ID NO 


4879gtcagcccagttccttcca 


10932 


10951 


3 


SEQ ID NO: 


3539 


tttgcttcagctgacctca 


2162 


2181 SEQ ID NO


4880tgaggaaactcagatcaaa 


12265 


12284 


3 


SEQ ID NO: 


3540 


cttggaaggaaaaggcttt 


2191 


2210 SEQ ID NO


4881 aaagcattggtagagcaag 


7850 


7869 


3 


SEQ ID NO: 


3541 


tggaaggaaaaggctttga 


2193 


2212 SEQ ID NO


4882tcaagtctgtgggattcca 


4086 


4105 


3 


SEQ ID NO: 


3542 


ggctttgagccaacattgg 


2204 2223 SEQ ID NO


4883ccaagaggtatttaaagcc 


12958 


12977 


3 


SEQ ID NO: 


3543 


tgagccaacattggaagct 


2209 


2228 SEQ ID NO


4884 agctttctgccactgctca


13521 


13540 


3 


SEQ ID NO: 


3544 


gagccaacattggaagctc 


2210 


2229 SEQ ID NO


4885gagctttctgccactgctc 


13520 


13539 


3 


SEQ ID NO: 


3545 


aacattggaagctcttttt 


2215 


2234 SEQ ID NO




4539 


4558 


3 


SEQ ID NO: 


3546 


tggaagctctttttgggaa 


2220 


2239 SEQ ID NO


4887ttccggcacgtgggttcca 


3785 


3804 


3 


SEQ ID NO: 


3547 


ctctttttgggaagcaagg 


2226 


2245 SEQ ID NO


4888 ccttactgactttgcagag 


7798 


7817 


3 


SEQ ID NO: 


3548 


tttttgggaagcaaggatt 


2229 


2248 SEQ ID NO


4889 aatcattgaaaaattaaaa 


6730 


6749 


3 


SEQ ID NO: 


3549 


ttttcccagacagtgtcaa 


2247 


2266 SEQ ID NO


4890ttgatgaaatcattgaaaa 


6723 


6742 1 


3 


SEQ ID NO: 


3550 


ttggctataccaaagatga 


2331 


2350 SEQ ID NO


4891 tcattgctcccggagccaa 


2676 


2695 1 


3 


SEQ ID NO: 


3551 


ataccaaagatgataaaca 


2337 


2356 SEQ ID NO


4892tgttgcttttgtaaagtat 


6280 


6299 1 


3 


SEQ ID NO: 


3552 


gagcaggatatggtaaatg 


2357 


2376 SEQ ID NO


4893 catttcagccttcgggctc 


4262 


4281 1 


3 


SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 


3553 
3554 
3555 


atggtaaatggaataatgc 


2366 
2367 


2385 SEQ ID NO
2386 SEQ ID NO
2389 SEQ ID NO


4894 gcatgcctagtttctccat


9954 
10809 


9973 1 
10828 1 
13086 1 


3 


taaatggaataatgctcag 


2370 


4896ctgaaagagatgaaattta 


13067 


3 
3 


SEQ ID NO: 


3556 


tggaataatgctcagtgtt 


2374 


2393 SEQ ID NO


4897aacagatttgaggattcca 


7981 


8000 1 


3 


SEQ ID NO: 


3557 


tcagtgttgagaagctgat 


2385 


2404 SEQ ID NO 


4898 atcacaactcctccactga


9542 


9561 1 


3 


SEQ ID NO: 


3558 


cagtgttgagaagctgatt 


2386 


2405 SEQ ID NO 


4899 aatcacaactcctccactg 


9541 


9560 1 


3 


SEQ ID NO: 


3559 


agtgttgagaagctgatta 


2387 


2406 SEQ ID NO


4900 taatcacaactcctccact 


9540 


9559 1 


3 


SEQ ID NO: 


3560 


gattaaagatttgaaatcc 


2401 


2420 SEQ ID NO


4901 ggatactaagtaccaaatc 


6874 


6893 1 


3 


SEQ ID NO: 


3561 


gatttgaaatccaaagaag


2408 


2427 SEQ ID NO:


4902 cttccgtttaccagaaatc


8248 


8267 1 


3 


SEQ ID NO: 


3562 


atttgaaatccaaagaagt 


2409 2428 SEQ ID NO:


4903acttccgtttaccagaaat 


8247 


8266 1


3 
SEQ ID NO:


3563 


atccaaagaagtcccggaa 


2416 


2435 SEQ ID NO


4904 ttccaatttccctgtggat


3688 


3707 


1 3 


SEQ ID NO: 


3564 


tccaaagaagtcccggaag 


2417 


2436 SEQ ID NO:


4905 cttccaatttccctgtgga


3687 


3706 


1 3 


SEQ ID NO: 


3565 


agagcctacctccgcatct 


2438 


2457 SEQ ID NO:


4906 agattaatccgctggctct


8571 


8590 


1 3 


SEQ ID NO: 


3566 


gagcctacctccgcatctt 


2439 


2458 SEQ ID NO:


4907 aagattaatccgctggctc


8570 


8589 


1 3 


SEQ ID NO: 


3567 


cttgggagaggagcttggt 


2455 2474 SEQ ID NO




12527 


12546 


1 3 


SEQ ID NO: 


3568 


ggagcttggttttgccagt 


2464 


2483 SEQ ID NO


4909 actggtggcaaaaccctcc 


2734 


2753 


1 3 


SEQ ID NO: 


3569 


ttggttttgccagtctcca 


2469 


2488 SEQ ID NO


4910tggagaagccacactccaa 


10771 


10790 


1 3 


SEQ ID NO: 


3570 


cagtctccatgacctccag 


2479 


2498 SEQ ID NO


4911 ctggtcgcctgccaaactg


3538 


3557 


1 3 


SEQ ID NO: 


3571 


ctccatgacctccagctcc 


2483 


2502 SEQ ID NO


4912ggagtcattgctcccggag 


2672 


2691 


1 3 


SEQ ID NO: 


3572 


ctgggaaagctgcttctga 


2501 


2520 SEQ ID NO


4913 tcagaaagctaccttccag


7939 


7958 


1 3 


SEQ ID NO: 


3573 


gaggtcatcaggaagggct 


2561 


2580 SEQ ID NO


4914 agccagaagtgagatcctc


3514 


3533 


1 3 


SEQ ID NO: 


3574 


aagaatgacttttttcttc 


2582 


2601 SEQ ID NO


4915 gaaggcatctgggagtctt


3835 


3854 


1 3


SEQ ID NO: 


3575 


cttttttcttcactacatc


2590 


2609 SEQ ID NO


4916 gatgcttacaacactaaag


6107 


6126 


1 3 


SEQ ID NO: 


3576 


catcttcatggagaatgcc 


2605 


2624 SEQ ID NO


4917 ggcaGttccaaaattgatg


10718 


10737 


1 3


SEQ ID NO: 


3577 


cttcatggagaatgccttt 


2608 


2627 SEQ ID NO 


4918 aaagttaattgggaagaag


12281 


12300 


1 3 


SEQ ID NO: 


3578 


aatgcctttgaactcccca 


2618 


2637 SEQ ID NO


4919 tgggctggcttcagccatt


5737 


5756 


1 3 


SEQ ID NO: 


3579 


gcctttgaactccccactg 


2621 


2640 SEQ ID NO


4920cagtctgaacattgcaggc 


5383 


5402 


3 


SEQ ID NO: 


3580 


caaggctggagtaaaactg 


2692 


2711 SEQ ID NO 


4921 cagtgcaacgaccaacttg 


5080 


5099 


3 


SEQ ID NO: 


3581 


tggagtaaaactggaagta 


2698 


2717 SEQ ID NO 


4922 tactccaacgccagctcca


3059 


3078 


3 


SEQ ID NO: 


3582 


ggaagtagccaacatgcag 


2710 


2729 SEQ ID NO 


4923ctgccatctcgagagttcc 


4106 


4125 


3 


SEQ ID NO: 


3583 


tttgtgacaaatatgggca 


2765 


2784 SEQ ID NO


4924tgcctttgtgtacaccaaa 


11236 


11255 


3 


SEQ ID NO: 


3584 


tgtgacaaatatgggcatc 


2767 


2786 SEQ ID NO


4925gatgggtctctacgccaca 


4385 


4404 


3 


SEQ ID NO: 


3585 


ggacttcgctaggagtggg 


2794 


2813 SEQ ID NO 


4926cccaaggccacaggggtcc 


12341 


12360 


3 


SEQ ID NO: 


3586 


gtggggtccagatgaacac 


2808 


2827 SEQ ID NO 


4927gtgttctagacctctccac 


4179 


4198 


3 


SEQ ID NO: 


3587 


ttccacgagtcgggtctgg 


2834 


2853 SEQ ID NO


4928 ccagaatctgtaccaggaa


12562 


12581 


3 


SEQ ID NO:


3588 


agtcgggtctggaggctca 


2841 


2860 SEQ ID NO


4929tgagaactacgagctgact 


4807 


4826 


3 


SEQ ID NO: 


3589 


tcgggtctggaggctcatg 


2843 


2862 SEQ ID NO


4930 catgaaggccaaattccga


7639 


7658 


3 


SEQ ID NO: 


3590 


aaaagctgggaagctgaag 


2869 


2888 SEQ ID NO 


4931 cttccagacacctgatttt 


7951 


7970 


3 


SEQ ID NO: 


3591 


aagctgaagtttatcattc 


2879 


2898 SEQ ID NO


4932gaatttacaattgttgctt 


6269 


6288 


3 


SEQ ID NO: 


3592 


gagaccagtcaagctgctc 


2908 


2927 SEQ ID NO 


4933gagcttcaggaagcttctc 


13214 


13233 


3 


SEQ ID NO: 


3593 


gcaacacattacatttggt 


2934 


2953 SEQ ID NO 


4934 accagtcagatattgttgc


10191 


10210 


3 


SEQ ID NO: 


3594 


acattacatttggtctcta 


2939 


2958 SEQ ID NO


4935tagaatatgaactaaatgt 


11889 


11908 


3 


SEQ ID NO: 


3595 


cattacatttggtctctac 


2940 


2959 SEQ ID NO


4936gtagctgagaaaatcaatg 


7106 


7125 


3 


SEQ ID NO: 


3596 


aaacggaggtgatcccacc 


2964 2983 SEQ ID NO


4937 ggtggataccctgaagttt 


3205 


3224 


3 


SEQ ID NO: 


3597 


attgagaacaggcagtcct 


2987 3006 SEQ ID NO


4938 aggaaaagcgcacctcaat 


12031 


12050 


3 


SEQ ID NO: 


3598 


tgagaacaggcagtcctgg 


2989 


3008 SEQ ID NO


4939ccagcttccccacatctca 


8341 


8360 


3 


SEQ ID NO: 


3599 


ctgcacctcaggcgcttac 


3043 


3062 SEQ ID NO


4940gtaagaaaatacagagcag 


6440 


6459 


3 


SEQ ID NO: 


3600 


tccacagactccgcctcct 


3074 


3093 SEQ ID NO


4941 aggacagagccttggtgga


3192 


3211 


3 


SEQ ID NO: 


3601 


ctgaccggggacaccagat 


3101 


3120 SEQ ID NO


4942atctgatgaggaaactcag 


12259 


12278 1 


3 


SEQ ID NO: 


3602 


tagagctggaactgaggcc


3120 


3139 SEQ ID NO


4943 ggcctctctggggcatcta 


5144 


5163 1 


3 


SEQ ID NO: 


3603 


ctatgagctccagagagag 


3175 


3194 SEQ ID NO


4944 ctctcacaaaaaagtatag 


6549 


6568 1 


3 


SEQ ID NO: 


3604 


cttggtggataccctgaag 


3202 


3221 SEQ ID NO


4945 cttcaggaagcttctcaag 


13217 


13236 1 


3 


SEQ ID NO: 


3605 


ttgtaactcaagcagaagg 


3222 


3241 SEQ ID NO


4946 ccttacacaataatcacaa 


9530 


9549 1 


3 


SEQ ID NO: 


3606 


taactcaagcagaaggtgc 


3225 


3244 SEQ ID NO 


4947gcacctagctggaaagtta 


6955 


6974 1 


3 


SEQ ID NO: 


3607 


gcagaaggtgcgaagcaga 


3233 


3252 SEQ ID NO 


4948tctgtgggattccatctgc 


4091 


4110 1 


3 


SEQ ID NO: 


3608 


cagaaggtgcgaagcagac


3234 


3253 SEQ ID NO 


4949 gtctgtgggattccatctg


4090 


4109 1 


3 


SEQ ID NO: 


3609 


gtatgaccttgtccagtga 


3288 


3307 SEQ ID NO 




10851 


10870 1 


3 


SEQ ID NO: 


3610 




3289 


3308 SEQ ID NO 


4951 ttcaccaacggagaacata 


10850 


10869 1 


3 


SEQ ID NO: 


3611 


gaagtccaaattccggatt 


3305 


3324 SEQ ID NO




10052 


10071 1 


3 


SEQ ID NO: 


3612 


gagggcaaaacgtcttaca 


3371 


3390 SEQ ID NO


4953 tgtacaactggtccgcctc


4215 


4234 1 


3 
SEQ ID NO:


3613


agggcaaaacgtcttacag 


3372 


3391 SEQ ID NO:


4954 ctgttaggacaccagccct


4062 


4081 


1 3 


SEQ ID NO: 


3614


gactcaccctggacattca 


3390 


3409 SEQ ID NO


4955tgaaattcaatcacaagtc 


9076 


9095 


1 3 


SEQ ID NO: 


3615 


ctggacattcagaacaaga 


3398 


3417 SEQ ID NO


4956 tcttttcttttcagcccag


9226 


9245 


1 3 


SEQ ID NO: 


3616 


tcatgggcgacctaagttg 


3435 


3454 SEQ ID NO


4957caactgcagacatatatga 


6635 


6654 


1 3 


SEQ ID NO: 


3617 


tgggcgacctaagttgtga 


3438 


3457 SEQ ID NO 


4958 tcactccattaacctccca


6316 


6335 


1 3 


SEQ ID NO: 


3618 


agttgtgacacaaaggaag 


3449 


3468 SEQ ID NO


4959cttcttttccaattgaact 


13838 


13857 


1 3 


SEQ ID NO: 
SEQ ID NO: 


3619 
3620 


tgacacaaaggaagaaaga 
gacacaaaggaagaaagaa 


3454 
3455 


3473 SEQ ID NO
3474 SEQ ID NO


4960 tcttcatcttcatctgtca
4961ttcttcatcttcatctgtc 


10220 
10219 


10239 
10238 


1 3 
1 3 


SEQ ID NO: 
SEQ ID NO: 


3621 
3622 


ggaagaaagaaaaatcaag 
aaaatcaagggtgttattt 


3463 3482 SEQ ID NO
3473 3492 SEQ ID NO


4962cttgtcatgcctacgttcc 
4963aaatcttattggggatttt 


11348 
7084 


11367 
7103 


1 3 
1 3 


SEQ ID NO: 


3623 


tccataccccgtttgcaag


3491 


3510 SEQ ID NO


4964cttggattcaaaatgtgga 


6858 


6877 


3 


SEQ ID NO: 


3624 


tgcaagcagaagccagaag 


3504 


3523 SEQ ID NO


4965 cttcagggaacacaatgca 


5185 


5204 


3 


SEQ ID NO: 


3625 


cagaagccagaagtgagat 


3510 


3529 SEQ ID NO


4966 atctatgccatctcttctg 


5633 


5652 


3 


SEQ ID NO: 


3626 


tgagatcctcgcccactgg 


3523 


3542 SEQ ID NO


4967 ccagcttccccacatctca 


8341 


8360 




SEQ ID NO: 


3627 


ggtcgcctgccaaactgct


3540 


3559 SEQ ID NO


4968 agcacatatgaactggacc


13947 


13966 


3 


SEQ ID NO: 


3628 


tgcttctccaaatggactc 


3555 


3574 SEQ ID NO 


4969 gagtttatcagtcagagca 


9701 


9720 


3 


SEQ ID NO: 


3629 


tggactcatctgctacagc


3567 


3586 SEQ ID NO


4970gctgcagtggcccgttcca 


8167 


8186 


3 


SEQ ID NO: 


3630 


gctacagcttatggctcca 


3578 


3597 SEQ ID NO


4971 tggaggacattcctctagc 


8211 


8230 


3 


SEQ ID NO: 


3631 


ggtggcatggcattatgat 


3610 


3629 SEQ ID NO 


4972 atcacaaattagtttcacc 


8947 


8966 


3 


SEQ ID NO: 


3632 


agagaagattgaatttgaa 


3631 


3650 SEQ ID NO


4973ttcaacgatacctgtctct 


7713 


7732 


3 


SEQ ID NO: 


3633 


caggcaccaatgtagatac 


3657 


3676 SEQ ID NO 


4974gtatgctaatagactcctg 


3736 


3755 


3 


SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 


3634 


gacttccaatttccctgtg 
gtccctcaaacagacatga 


3685 
3764 
3770 


3704 SEQ ID NO
3783 SEQ ID NO
3789 SEQ ID NO


4975cacaatgcaaaattcagtc 


5195 
12777 
7022 


5214 
12796 
7041 


3 


3635 
3636 


caaacagacatgactttcc 


4976tcataagggaggtagggac 
4977 ggaactacaatttcatttg


3 
3 


SEQ ID NO: 


3637 


atagttgcaatgagctcat 


3809 


3828 SEQ ID NO 


4978atgatttgaaaatagctat 


6693 


6712 


3 


SEQ ID NO: 


3638 


gcttcagaaggcatctggg 


3829 


3848 SEQ ID NO


4979cccaagaggtatttaaagc 


12957 


12976 


3 


SEQ ID NO: 


3639 


ggagttcaacctccagaac 


3895 


3914 SEQ ID NO 


4980 gttcactccattaacctcc


6314 


6333 


3 


SEQ ID NO: 


3640 


agaaaacctcttcttaaaa 


3940 


3959 SEQ ID NO


4981 ttttctaaatggaacttct 


12173 


12192 


3 


SEQ ID NO: 


3641 


aaaacctcttcttaaaaag 


3942 


3961 SEQ ID NO 


4982 ctttgaaaaattctctttt


9213 


9232 


3 


SEQ ID NO: 


3642 


aaaaagcgatggccgggtc 


3955 


3974 SEQ ID NO


4983 gaccttgcaagaatatttt


6343 


6362 


3 


SEQ ID NO: 


3643 


gtcaaatataccttgaaca 


3971 


3990 SEQ ID NO


4984tgttaacaaattccttgac 


7363 


7382 


3


SEQ ID NO: 


3644 


tgaacaagaacagtttgaa 




4003 SEQ ID NO


4985ttcaagttcctgaccttca 


8310 


8329 


3 


SEQ ID NO: 


3645 


agtttgaaaattgagattc 


3995 


4014 SEQ ID NO 


4986gaatctggctccctcaact 


9047 


9066 1 


3 


SEQ ID NO: 


3646 


gtttgaaaattgagattcc 


3996 


4015 SEQ ID NO


4987ggaaataccaagtcaaaac 


10454 


10473 1 


3 


SEQ ID NO: 


3647 


ttgaaaattgagattcctt 


3998 


4017 SEQ ID NO


4988aaggaaaagcgcacctcaa 


12030 


12049 1 


3 


SEQ ID NO: 


3648 


ctaaagatgttagagactg 


4046 


4065 SEQ ID NO


4989cagttgaccacaagcttag 


10545 


10564 1 


3 


SEQ ID NO: 


3649 


atgttagagactgttagga 


4052 


4071 SEQ ID NO


4990 tccttaacaccttccacat 


8073 


8092 1 


3 


SEQ ID NO: 


3650 


cagccctccacttcaagtc 


4074 


4093 SEQ ID NO


4991 gacttctctagtcaggctg 


8813 


8832 1 


3 


SEQ ID NO: 


3651 


agccctccacttcaagtct 


4075 


4094 SEQ ID NO


4992 agacatcgctgggctggct 


5728 


5747 1 


3 


SEQ ID NO: 


3652 


ccatctgccatctcgagag 


4102 


4121 SEQ ID NO 


4993 ctctcaaatgacatgatgg


5330 


5349 1 


3 


SEQ ID NO: 


3653 


attcccaagttgtatcaac 


4142 


4161 SEQ ID NO


4994gttgagaagccccaagaat 


6254 


6273 1 


3 


SEQ ID NO: 






4156 


4175 SEQ ID NO


4995 gagatcaagacactgttga 


8843 


8862 1 


3 


SEQ ID NO: 


3655 


ggtgttctagacctctcca 


4178 


4197 SEQ ID NO 


4996tggaaccctctccctcacc 


4735 


4754 1 


3 


SEQ ID NO: 


3656 


ctccacgaatgtctacagc 


4192 


4211 SEQ ID NO 


4997 gctggtaacctaaaaggag 


5588 


5607 1 


3 


SEQ ID NO: 


3657 


cacgaatgtctacagcaac 


4195 


4214 SEQ ID NO


4998 gttgcccaccatcatcgtg 


11671 


11690 1 


3 


SEQ ID NO: 


3658 


acgaatgtctacagcaact 


4196 


4215 SEQ ID NO


4999 agttgcccaccatcatcgt


11670 


11689 1 


3 


SEQ ID NO: 


3659 


tcctacagtggtggcaaca


4232 


4251 SEQ ID NO 


5000tgttagttgctcttaagga 


13359 


13378 1 


3 


SEQ ID NO: 


3660 


cgttaccacatgaaggctg 


4280 


4299 SEQ ID NO:


5001 cagcaagtacctgagaacg 


8611 


8630 1 


3 


SEQ ID NO: 


3661 


gaaggctgactctgtggtt 


4291 


4310 SEQ ID NO:


5002 aacctatgccttaatcttc


13169 


13188 1 


3 


SEQ ID NO: 


3662 


tgtggttgacctgctttcc 


4303 


4322 SEQ ID NO:




6965 


6984 1 


3 
SEQ ID NO:


3663 


cctgctttcctacaatgtg 


4312 


4331 SEQ ID NO 


5004cacaccttgacattgcagg 


11088 


11107 


3 


SEQ ID NO: 


3664 


ctgctttcctacaatgtgc 


4313 


4332 SEQ ID NO


5005 gcacaccttgacattgcag 


11087 


11106 


3 


SEQ ID NO: 


3665 


tcctacaatgtgcaaggat 


4319 


4338 SEQ ID NO 


5006 atccgctggctctgaagga 


8577 


8596 


3 


SEQ ID NO: 


3666 


tatgaccacaagaatacgt 


4352 


4371 SEQ ID NO 


5007 acgtccgtgtgccttcata 


9984 


10003 


3 


SEQ ID NO: 


3667 


atgaccacaagaatacgtc 


4353 


4372 SEQ ID NO


5008gacgtccgtgtgccttcat 


9983 


10002 


3 


SEQ ID NO: 


3668 


gaatacgtctacactatca 


4363 


4382 SEQ ID NO


5009 tgattatctgaattcattc 


6487 


6506 


3 


SEQ ID NO: 


3669 


tttctagattcgaatatca 


4406 


4425 SEQ ID NO


5010tgatttacatgatttgaaa 


6685 


6704 


3 


SEQ ID NO: 




gattcgaatatcaaattca 


4412 


4431 SEQ ID NO




7102 


7121 


3 


SEQ ID NO: 


3671 


gaaacaacccagtctcaaa 


4449 


4468 SEQ ID NO 


5012 tttgaaaaattctctttta


9214 


9233 


3 


SEQ ID NO: 


3672 


cccagtctcaaaaggttta 


4456 


4475 SEQ ID NO


5013 taaattcattactcctggg


11302 


11321 


3 


SEQ ID NO: 


3673 


ctcaaaaggtttactaata 


4462 


4481 SEQ ID NO 


5014 tattcaaaactgagttgag 


12231 


12250 


3 


SEQ ID NO: 


3674 


tcaaaaggtttactaatat


4463 


4482 SEQ ID NO


5015atattcaaaactgagttga 


12230 


12249 


3 


SEQ ID NO: 


3675 


aaaaggtttactaatattc 


4465 


4484 SEQ ID NO


5016gaatttgaaagttcgtttt 


9280 


9299 


3 


SEQ ID NO: 




gaaacagcatttgtttgtc 


4543 


4562 SEQ ID NO


5017gacagcatcttcgtgtttc 


11214 


11233 


3 


SEQ ID NO:


3677 


atttgtttgtcaaagaagt 


4551 


4570 SEQ ID NO


5018acttaaaaaatataaaaat 


8022 


8041 


3 


SEQ ID NO: 




tcaagattgatgggcagtt 


4569 


4588 SEQ ID NO 




13422 


13441 


3 


SEQ ID NO: 


3679 


ttcagagtctcttcgttct


4585 


4605 SEQ ID NO


5020 agaagatggcaaatttgaa


11995 


12014 


3 


SEQ ID NO: 


3680 


cagagtctcttcgttctat 


4588 


4607 SEQ ID NO


5021 atagcatggacttcttctg 


8873 


8892 


3 


SEQ ID NO: 


3681 


atgctaaaggcacatatgg 


4605 


4624 SEQ ID NO


5022 ccatttgagatcacggcat 


9245 


9264 


3 


SEQ ID NO: 


3682 


gcacatatggcctgtcttg 


4614 


4633 SEQ ID NO


5023 caagttggcaagtaagtgc 


9372 


9391 


3 


SEQ ID NO: 


3683 


gagtccaacctgaggttta 


4667 


4686 SEQ ID NO


5024taaagtgccacttttactc 


6190 


6209 


3 


SEQ ID NO: 


3684 


agtccaacctgaggtttaa 


4668 


4687 SEQ ID NO


5025ttaacagggaagatagact 


9308 


9327 


3 


SEQ ID NO: 


3685 


cctacctccaaggcaccaa 


4692 


4711 SEQ ID NO


5026ttggcaagtaagtgctagg 


9376 


9395 


3 


SEQ ID NO: 


3686 


gaagatggaaccctctccc 


4730 


4749 SEQ ID NO


5027 gggaagaagaggcagcttc 


12291 


12310 


3 


SEQ ID NO: 


3687 


tgatctgcaaagtggcatc 


4762 


4781 SEQ ID NO 


5028 gatgaggaaactcagatca 


12263 


12282 


3 


SEQ ID NO: 


3688 


gatctgcaaagtggcatca 


4763 


4782 SEQ ID NO


5029 tgatgaggaaactcagatc


12262 


12281 


3 


SEQ ID NO: 


3689 


gcttccctaaagtatgaga 


4793 


4812 SEQ ID NO


5030tctcgtgtctaggaaaagc 


5977 


5996 


3 


SEQ ID NO: 


3690 


gtatgagaactacgagctg 


4804 


4823 SEQ ID NO 


5031 cagcttaagagacacatac 


6920 


6939 


3 


SEQ ID NO: 




tctaacaagatggatatga 


4868 


4887 SEQ ID NO 


5032 tcattttccaactaataga


13032 


13051 


3 


SEQ ID NO: 


3692 


ctgctgcgttctgaatatc 


4907 


4926 SEQ ID NO


5033 gatacaagaaaaactgcag 


6901 


6920 


3 


SEQ ID NO: 


3693 


tcattgaggttcttcagcc 


4940 


4959 SEQ ID NO


5034ggctcatatgctgaaatga 


5348 


5367 


3 


SEQ ID NO: 


3694 


ttctggatcactaaattcc 


4963 


4982 SEQ ID NO 


5035 ggaaggacaaggcccagaa 


12549 


12568 


3 


SEQ ID NO: 


3695 


ccatggtcttgagttaaat 


4981 


5000 SEQ ID NO


5036 atttttattcctgccatgg 


10103 


10122 


3 


SEQ ID NO: 


3696 


tcttaggcactgacaaaat 


5007 


5026 SEQ ID NO


5037attttttgcaagttaaaga 


14019 


14038 


3 


SEQ ID NO: 


3697 


acaaggcgacactaaggat 


5040 


5059 SEQ ID NO


5038 atccatgatctacatttgt 


6794 


6813 


3 


SEQ ID NO: 


3698 


tgcaacgaccaacttgaag 


5083 


5102 SEQ ID NO


5039 cttcagggaacacaatgca 


5185 


5204 


3 


SEQ ID NO: 


3699 




5092 


5111 SEQ ID NO


5040 gagatgagagatgccgttg 


6239 


6258 


3 


SEQ ID NO: 


3700 


gctggagaatgagctgaat 


5116 


5135 SEQ ID NO


5041 attctcttttcttttcagc


9222 


9241 


3 


SEQ ID NO: 


3701 


gcagagcttggcctctctg 


5135 


5154 SEQ ID NO


5042cagatacaagaaaaactgc 


6899 


6918 


3 


SEQ ID NO: 


3702 


tctctggggcatctatgaa 


5148 


5167 SEQ ID NO 


5043ttcattcaattgggagaga 


6499 


6518 


3 


SEQ ID NO: 


3703 


tctggggcatctatgaaat 


5150 


5169 SEQ ID NO


5044 atttgtaagaaaatacaga 


6436 


6455 


3 


SEQ ID NO: 


3704 


aacacaatgcaaaattcag 


5193 


5212 SEQ ID NO


5045ctgaagcattaaaactgtt 


7506 


7525 


3 


SEQ ID NO: 


3705 


ctcacagagctatcactgg 


5231 


5250 SEQ ID NO 


5046 ccagatgctgaacagtgag 


8149 


8168 


3 


SEQ ID NO: 


3706 


tgggaagtgcttatcaggc 


5247 


5266 SEQ ID NO


5047 gcctacgttccatgtccca 


11356 


11375 


3 


SEQ ID NO: 


3707 


ttcaaggtcagtcaagaag 


5303 


5322 SEQ ID NO 


5048 cttcagtgcagaatatgaa 


11977 


11996 


3 


SEQ ID NO: 


3708 


aatgacatgatgggctcat 


5336 


5355 SEQ ID NO


5049atgattatctgaattcatt 


6486 


6505 


3 


SEQ ID NO: 


3709 


gctcatatgctgaaatgaa 


5349 


5368 SEQ ID NO


5050ttcagccattgacatgagc 


5746 


5765 


3 


SEQ ID NO: 


3710 


atatgctgaaatgaaattt 


5353 


5372 SEQ ID NO


5051 aaatagctattgctaatat 


6702 


6721 


3 


SEQ ID NO: 


3711 


tctgaacattgcaggctta 


5386 


5405 SEQ ID NO


5052taagaaccagaagatcaga 


10996 


11015 


3 


SEQ ID NO: 


3712 




5389 


5408 SEQ ID NO


5053tgatatcgacgtgaggttc 


12490 


12509 


3 
SEQ ID NO:


3713 


tgcaggcttatcactggac 


5395 


5414 SEQ ID NO 


SEQ ID NO: 


3714 


tcaaaacttgacaacattt 


5420 


5439 SEQ ID NO 


SEQ ID NO: 


3715 


atttacagctctgacaagt 


5435 


5454 SEQ ID NO


SEQ ID NO: 


3716 


ctctgacaagttttataag 


5443 


5462 SEQ ID NO


SEQ ID NO: 


3717 


gttaatttacagctacagc 


5468 


5487 SEQ ID NO 


SEQ ID NO: 


3718 


ttctctggtaactacttta 


5491


5510 SEQ ID NO


SEQ ID NO: 


3719 


cctaaaaggagcctaccaa 


5596 


5615 SEQ ID NO


SEQ ID NO: 


3720 


aaaaggagcctaccaaaat


5599 


5618 SEQ ID NO


SEQ ID NO: 


3721 


aggagcctaccaaaataat


5602 


5621 SEQ ID NO


SEQ ID NO: 


3722 


ataatgaaataaaacacat 


5616 


5635 SEQ ID NO


SEQ ID NO: 


3723 


aaaacacatatatgccatc 


5626 


5645 SEQ ID NO


SEQ ID NO: 


3724 


tgctaaggttcagggtgtg 


5686 


5705 SEQ ID NO


SEQ ID NO: 


3725 


gagtttagccatcggctca 


5705 


5724 SEQ ID NO


SEQ ID NO: 


3726 


gctggcttcagccattgac 


5740 


5759 SEQ ID NO


SEQ ID NO: 


3727 


atttcagcaatgtcttccg 


5790 


5809 SEQ ID NO


SEQ ID NO: 


3728 


tttcagcaatgtcttccgt 


5791 


5810 SEQ ID NO


SEQ ID NO: 


3729 


ttcagcaatgtcttccgtt 


5792 


5811 SEQ ID NO


SEQ ID NO: 


3730 


cagcaatgtcttccgttct 


5794 


5813 SEQ ID NO


SEQ ID NO: 


3731 


tgtcttccgttctgtaatg 


5800 


5819 SEQ ID NO 


SEQ ID NO: 


3732 


gtcttccgttctgtaatgg 


5801 


5820 SEQ ID NO


SEQ ID NO: 


3733 


atgggaaactcgctctctg 


5859 


5878 SEQ ID NO


SEQ ID NO: 


3734 


ggagaacatactgggcagc 


5879 


5898 SEQ ID NO


SEQ ID NO: 


3735 


gttgaaagcagaacctctg 


5914 


5933 SEQ ID NO


SEQ ID NO: 


3736 


gtctaggaaaagcatcagt 


5983 


6002 SEQ ID NO


SEQ ID NO: 


3737 


agcatcagtgcagctcttg 


5993 


6012 SEQ ID NO


SEQ ID NO: 


3738 


ttgaacacaaagtcagtgc 


6009 


6028 SEQ ID NO


SEQ ID NO: 


3739 


gcagacaggcacctggaaa 


6046 


6065 SEQ ID NO:


SEQ ID NO: 


3740 


gaaactcaagacccaattt 


6061 


6080 SEQ ID NO:


SEQ ID NO: 


3741 


acaatgaatacagccagga 


6084 


6103 SEQ ID NO:


SEQ ID NO: 


3742 


cttggatgcttacaacact 


6103 


6122 SEQ ID NO:


SEQ ID NO: 


3743 


ttggcgtggagcttactgg 


6132 


6151 SEQ ID NO:


SEQ ID NO: 


3744 


cacttttactcagtgagcc


6198 


6217 SEQ ID NO:


SEQ ID NO: 


3745 


tttagagatgagagatgcc 


6235 


6254 SEQ ID NO:


SEQ ID NO: 


3746 


gagaagccccaagaattta 


6257 


6276 SEQ ID NO:


SEQ ID NO: 


3747 


caattgttgcttttgtaaa 


6276 


6295 SEQ ID NO:


SEQ ID NO: 


3748 


ttttgtaaagtatgataaa 


6286 


6305 SEQ ID NO:


SEQ ID NO: 


3749 


ttgtaaagtatgataaaaa 


6288 


6307 SEQ ID NO:


SEQ ID NO: 


3750 


ttcactccattaacctccc 


6315 


6334 SEQ ID NO:


SEQ ID NO: 


3751 


ttttgagaccttgcaagaa 


6337 


6356 SEQ ID NO:


SEQ ID NO: 


3752 


accttgcaagaatattttg 


6344 


6363 SEQ ID NO: 


SEQ ID NO: 


3753 


tcaatattgatcaatttgt


6423 


6442 SEQ ID NO: 


SEQ ID NO: 








6470 SEQ ID NO:


SEQ ID NO: 


3755 


cctgggaaaactcccacag 


6460 


6479 SEQ ID NO: 


SEQ ID NO: 


3756 




6469 


6488 SEQ ID NO: 


SEQ ID NO: 


3757 


aattcattcaattgggaga 


6497 


6516 SEQ ID NO:


SEQ ID NO: 


3758 


ttcaattgggagagacaag 


6503


6522 SEQ ID NO:


SEQ ID NO: 


3759 


aggagaaactgactgctct 


6534 


6553 SEQ ID NO:


SEQ ID NO: 


3760 


actgactgctctcacaaaa 


6541 


6560 SEQ ID NO:


SEQ ID NO: 


3761 


gactgctctcacaaaaaag 


6544 


6563 SEQ ID NO: 


SEQ ID NO: 


3762 


cagacatatatgatacaat 


6641 


6660 SEQ ID NO:
5054 gtcctggattccacatgca


11852 


11871 


1 3 


5055aaattccttgacatgttga 


7370 


7389 


1 3 


5056 acttaaaaaatataaaaat


8022 


8041 


1 3 


5057cttacttgaattccaagag 


10674 


10693 


1 3 


5058 gctgcatgtggctggtaac 


5578 


5597 


1 3 


5059taaaagattactttgagaa 


7275 


7294 


1 3 


5060ttggcaagtaagtgctagg 


9376 


9395 


1 3 


5061atttacaattgttgctttt 


6271 


6290 


3 


5062 attacctatgatttctcct 


10127 


10146 


3 


5063 atgtcaaacactttgttat 


7065 


7084 


3 


5064gatgaagatgacgactttt 


12158 


12177 


3 


5065 cacaagtcgattcccagca 


9087 


9106 


3 


5066tgaggtgactcagagactc 


7450 


7469 


3 


5067gtcagtgaagttctccagc 


8596 


8615 


3 


5068cggagcatgggagtgaaat 


8628 


8647 


3 


5069acggagcatgggagtgaaa 


8627 


8646 


3 


5070aacggagcatgggagtgaa 


8626 


8645 


3 


5071 agaagtgtcttcaaagctg


12412 


12431 


3 


5072cattcaattgggagagaca 


6501 


6520 


3 


5073ccattcagtctctcaagac 


12975 


12994 


3 


5074cagataaaaaactcaccat 


12213 


12232 


3 


5075gctgttttgaagactctcc 


1088 


1107 


3 


5076cagaattcataatcccaac 


8274 


8293 


3 


5077actgcaagatttttcagac 


13612 


13631 


3 


5078caagaacctgttagttgct 


13351 


13370 


3 


5079 gcacatcaatattgatcaa 


6418 


6437 


3 


5080tttcagatggcattgctgc 


11610 


11629 


3 


5081 aaatcccatccaggttttc 


8037 


8056 


3 


5082tcctttggctgtgctttgt 


9682 


9701 


3 


5083 agtgaagttctccagcaag 


8599 


8618 


3 


5084 ccagaattcataatcccaa 


8273 


8292 


3 


5085ggctattgatgttagagtg 


6988 


7007 


3 


5086 ggcatgatgctcatttaaa 


9177 


9196 


3 


5087taaagccattcagtctctc 


12970 


12989 


3 


5088tttaaccagtcagatattg 


10187 


10206 


3 


5089tttattgctgaatccaaaa 


13655 


13674 


3 


5090ttttgagaggaatcgacaa 


6358 


6377 


3 


5091 gggaaaaaacaggcttgaa 


9576 


9595 1 


3 


5092ttctctctatgggaaaaaa 


9566 


9585 1 


3 


5093 caaaagaagcccaagaggt 


12948 


12967 


3 


5094 acaaagcagattatgttga 


11829 


11848 1 


3 


5095ttttcagaccaactctctg 


13622 


13641 1 


3 


5096 ctgtctctggtcagccagg 


7724 


7743 1 


3 


5097attacacttcctttcgagt 


12869 


12888 1 


3 


5098 tctcttcctccatggaatt 


10479 


10498 1 


3 


5099 cttggagtgccagtttgaa 


11808 


11827 1 


3 


5100 agagcttatgggatttcct 


11163 


11182 1 


3 


5101ttttggcaagctatacagt 


8380 




3 


5102ctttgtgagtttatcagtc 




9714 1 


3 


5103 attggatatccaagatctg 


1933 


1952 1 


3 
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SEQ ID NO: 


3763 


aatttgatcagtatattaa 


6657 


6676 SEQ ID NO 




5104 ttaaaagaaatcttcaatt 


13815 


13834 


1 3 


SEQ ID NO: 


3764 


tatgatttacatgatttga 


6683 


6702 SEQ ID NO 




5105 tcaatgattatatcccata 


13128 


13147 


1 3 


SEQ ID NO: 


3765 


tttgaaaatagctattgct 


6697 


6716 SEQ ID NO 




5106 agcacagaaaaaattcaaa 


13864 


13883 


1 3 


SEQ ID NO: 


3766 


ttgaaaatagctattgcta 


6698 


6717 SEQ ID NO 




5107 tagcacagaaaaaattcaa 


13863 


13882 


1 3 


SEQ ID NO: 


3767 


aatagctattgctaatatt 


6703 


6722 SEQ ID NO 




5108 aataaatggagtctttatt 


14084 


14103 


1 3 


SEQ ID NO: 


3768 


attattgatgaaatcattg 


6719 


6738 SEQ ID NO 




5109 caataccagaattcataat 


8268 


8287 


1 3 


SEQ ID NO: 


3769 


aaagtcttgatgagcacta 


6747 


6766 SEQ ID NO 




5110 tagtgattacacttccttt 


12864 


12883 


1 3 


SEQ ID NO: 


3770 


aagtcttgatgagcactat 


6748 


6767 SEQ ID NO 




5111 atagcaacactaaatactt 


8769 


8788 


1 3 


SEQ ID NO: 


3771 


ttgatgagcactatcatat 


6753 


6772 SEQ ID NO 




5112 atatccaagatgagatcaa 


13101 


13120 


1 3 


SEQ ID NO: 


3772 


taattttagtaaaaacaat 


6777 


6796 SEQ ID NO 




5113 attgagattccctccatta 


11702 


11721 


1 3 


SEQ ID NO: 


3773 


ttttagtaaaaacaatcca 


6780 


6799 SEQ ID NO 




5114 tggagtgccagtttgaaaa 


11810 


11829 


1 3 


SEQ ID NO: 


3774 


acatttgtttattgaaaat 


6805 


6824 SEQ ID NO 




5115 atttcctaaagctggatgt 


11175 


11194 


1 3 


SEQ ID NO: 








6843 SEQ ID NO 












SEQ ID NO: 


3776 


attttaacaaaagtggaag 


6828 


6847 SEQ ID NO 


5117 cttcaaagacttaaaaaat 


8014 


8033 


3 


SEQ ID NO: 


3777 


aaatcagaatccagataca 


6888 


6907 SEQ ID NO 


5118 tgtaccataagccatattt 


10088 


10107 


3 


SEQ ID NO: 


3778 


gaatccagatacaagaaaa 


6894 


6913 SEQ ID NO 


5119 ttttctaaacttgaaattc 


9065 


9084 


3 


SEQ ID NO: 


3779 


ttaagagacacatacagaa 


6924 


6943 SEQ ID NO 




5120 ttcttaaacattcctttaa 


9491 


9510 


3 


SEQ ID NO: 


3780 


atccagcacctagctggaa 


6950 


6969 SEQ ID NO 




5121 ttccaatttccctgtggat 


3688 


3707 


3 


SEQ ID NO: 


3781 


tgagcatgtcaaacacttt 


7060 


7079 SEQ ID NO 




5122 aaagtgccacttttactca 


6191 


6210 


3 


SEQ ID NO: 


3782 


gagcatgtcaaacactttg 


7061 


7080 SEQ ID NO 


5123 caaatgacatgatgggctc 


5334 


5353 


3 


SEQ ID NO: 
SEQ ID NO: 


3783 


aaacactttgttataaatc 


7070 


7089 SEQ ID NO 


5124 gattatatcccatatgttt 


13133 


13152 


3 


SEQ ID NO: 


3784 
3785 


tatgaagtagaccaacaaa 


SEQ ID NO 

7160 7179 SEQ ID NO 


5125 gaaggaaaagcgcacctca 
5126 tttgtggagggtagtcata 


12029 
10331 


12048 
10350 


3 
3 


SEQ ID NO: 


3786 


aagtagaccaacaaatcca 


7164 


7183 SEQ ID NO 


5 


27tggatgaagatgacgactt 


12156 


12175 


3 


SEQ ID NO: 


3787 


aagttgaaggagactattc 


7223 


7242 SEQ ID NO 


5 


28 gaataccaatgctgaactt 


10168 


10187 


3 


SEQ ID NO: 


3788 


acaagttaagataaaagat 


7264 


7283 SEQ ID NO 


5 


29atctaaattcagttcttgt 


11334 


11353 


3 


SEQ ID NO: 


3789 


aagataaaagattactttg 


7271 


7290 SEQ ID NO 


5 


30 caaaatagaagggaatctt 


2077 


2096 


3 


SEQ ID NO: 


3790 


gattactttgagaaattag 


7280 


7299 SEQ ID NO 


5 


31 ctaaacttgaaattcaatc 


9069 


9088 


3 


SEQ ID NO: 


3791 


tgagaaattagttggattt 


7288 


7307 SEQ ID NO 


5 


32aaatccgtgaggtgactca 


7443 


7462 


3 


SEQ ID NO: 


3792 


aaattagttggatttattg 


7292 


7311 SEQ ID NO 


5 


33caattttgagaatgaattt 


10419 


10438 


3 


SEQ ID NO: 


3793 


tggatttattgatgatgct 


7300 


7319 SEQ ID NO 


5 


34agcatgcctagtttctcca 


9953 


9972 


3 


SEQ ID NO: 


3794 


tcattgaagatgttaacaa 


7353 


7372 SEQ ID NO 


5 


35ttgtagatgaaaccaatga 


7422 


7441 


3 


SEQ ID NO: 


3795 


cattgaagatgttaacaaa 


7354 


7373 SEQ ID NO 


5 


36tttgtagatgaaaccaatg 


7421 


7440 


3 


SEQ ID NO: 


3796 


attgaagatgttaacaaat 


7355 


7374 SEQ ID NO 


5 


37atttaagtatgatttcaat 


10495 


10514 


3 


SEQ ID NO: 


3797 


ttgaagatgttaacaaatt 


7356 


7375 SEQ ID NO 


5 


38aatttaagtatgatttcaa 


10494 


10513 


3 


SEQ ID NO: 


3798 


tgaagatgttaacaaattc 


7357 


7376 SEQ ID NO 


5139 gaatttaagtatgatttca 


10493 


10512 1 


3 


SEQ ID NO: 


3799 


acatgttgataaagaaatt 


7380 


7399 SEQ ID NO 


5 


40aattccctgaagttgatgt 


11487 


11506 


3 


SEQ ID NO: 


3800 


tttgattaccaccagtttg 


7406 


7425 SEQ ID NO 


5 


41caaattgaacatccccaaa 


8791 


8810 1 


3 


SEQ ID NO: 


3801 


caaaatccgtgaggtgact 


7441 


7460 SEQ ID NO 


5 


42agtccccctaacagatttg 


7972 


7991 1 


3 


SEQ ID NO: 


3802 


aaaatccgtgaggtgactc 


7442 


7461 SEQ ID NO 


5143 gagtgaaatgctgtttttt 


8638 


8657 1 


3 


SEQ ID NO: 


3803 


aggtgactcagagactcaa 


7452 


7471 SEQ ID NO 


5144 ttgatgatatctggaacct 


10731 


10750 1 


3 


SEQ ID NO: 


3804 


gtgaaattcaggctctgga 


7473 


7492 SEQ ID NO 


5 


45tccaatctcctcttttcac 


8409 


8428 1 


3 


SEQ ID NO: 


3805 


gttgcagtgtatctggaaa 


7547 


7566 SEQ ID NO 


5 


46tttcaagcaaatgcacaac 


8540 




3 


SEQ ID NO: 


3806 


ttaagttcagcatctttgg 


7616 


7635 SEQ ID NO 


5 


47ccaatgctgaactttttaa 


10173 


10192 1 


3 


SEQ ID NO: 


3807 


tgaaggccaaattccgaga 


7641 


7660 SEQ ID NO 


5 


48tctcctttcttcatcttca 


10213 


10232 1 


3 


SEQ ID NO: 


3808 


aatgtatcaaatggacatt 


7684 


7703 SEQ ID NO 


5 


49aatgaagtccggattcatt 


11021 


11040 1 


3 


SEQ ID NO: 


3809 




7700 7719 SEQ ID NO 


5 


50gttgagaagccccaagaat 


6254 


6273 1 


3 


SEQ ID NO: 


3810 




7722 


7741 SEQ ID NO 


5 




9377 


9396 1 


3 


SEQ ID NO: 


3811 


cctgtctctggtcagccag 


7723 


7742 SEQ ID NO 


5 


52ctggacttctctagtcagg 


8810 


8829 1 


3 


SEQ ID NO: 


3812 


ggtcagccaggtttatagc 


7732 


7751 SEQ ID NO 


51 


53gctaaaggagcagttgacc 


10535 


10554 1 


3 
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SEQ ID NO: 


3813 


ccaggtttatagcacactt 


7738 


7757 SEQ ID NO 


5154aagtccggattcattctgg 


11025 


11044 


3 


SEQ ID NO: 


3814 


gtttatagcacacttgtca 


7742 


7761 SEQ ID NO 


5155 tgacctgtccattcaaaac 


13681 


13700 


3 


SEQ ID NO: 


3815 


acttgtcacctacatttct 


7753 


7772 SEQ ID NO 


5156agaaaaaggggattgaagt 


10283 


10302 


3 


SEQ ID NO: 


3816 


ctgattggtggactcttgc 


7770 


7789 SEQ ID NO 


5157gcaagttaaagaaaatcag 


14026 


14045 


3 


SEQ ID NO: 


3817 


atgaaagcattggtagagc 


7847 


7866 SEQ ID NO 


5158 gctcatctcctttcttcat 


10208 


10227 


3 


SEQ ID NO: 


3818 


tgaaagcattggtagagca 


7848 


7867 SEQ ID NO 


5159tgctcatctcctttcttca 


10207 


10226 


3 


SEQ ID NO: 


3819 


gggttcactgttcctgaaa 


7868 


7887 SEQ ID NO 


5160tttcaccatagaaggaccc 


8959 


8978 


3 


SEQ ID NO: 


3820 


tcaagaccatccttgggac 


7887 


7906 SEQ ID NO 


5161 gtccccctaacagatttga 


7973 


7992 


3 


SEQ ID NO: 


3821 


ccttgggaccatgcctgcc 


7897 


"916SEQIDNO 


5162ggcaccagggctcggaagg 


13978 


13997 


3 


SEQ ID NO: 


3822 


ttcaggctcttcagaaagc 


7929 


7948 SEQ ID NO 


5163 gcttgaaggaattcttgaa 


9588 


9607 


3 


SEQ ID NO: 


3823 


ttcagataaacttcaaaga 


8004 


8023 SEQ ID NO 


5164tcttcataagttcaatgaa 


13183 


13202 


3 


SEQ ID NO: 


3824 


acttcaaagacttaaaaaa 


8013 


8032 SEQ ID NO 


5165ttttaacaaaagtggaagt 


6829 


6848 


3 


SEQ ID NO: 


3825 


gSflaScttaaca 


8039 


8058 SEQ ID NO 


5166tggagaagcaaatctggat 


9472 


9491 


3 


SEQ ID NO: 








8082 SEQ ID NO 










SEQ ID NO: 


3827 


cattccttcctttacaatt 


8089 


8108 SEQ ID NO 


5168aattccaattttgagaatg 


10414 


10433 


3 


SEQ ID NO: 


3828 


ttgaccagatgctgaacag 


8145 


8164 SEQ ID NO 


5169 ctgttgaaagatttatcaa 


12932 


12951 


3 


SEQ ID NO: 


3829 


aatcaccctgccagacttc 


8233 


8252 SEQ ID NO 


5170gaagttctcaattttgatt 


8522 


8541 


3 


SEQ ID NO: 


3830 


tgaccttcacataccagaa 


8320 


8339 SEQ ID NO 


5171 ttcttctggaaaagggtca 


8884 


8903 


3 


SEQ ID NO: 


3831 


ttccagcttccccacatct 


8339 


8358 SEQ ID NO 


5172 agattctcagatgagggaa 


8921 


8940 


3 


SEQ ID NO: 


3832 


aagctatacagtattctga 


8387 


8406 SEQ ID NO 


5173tcagatggcattgctgctt 


11612 


11631 


3 


SEQ ID NO: 


3833 


attctgaaaatccaatctc 


8399 


8418 SEQ ID NO 


5174gagataaccgtgcctgaat 


11552 


11571 


3 


SEQ ID NO: 


3834 


tttcacattagatgcaaat 
caaatgctgacatagggaa 


8422 
8436 


8441 SEQ ID NO 
8455 SEQ ID NO 
8527 SEQ ID NO 


5175attttgaaaaaaacagaaa 


9738 
9670 


9757 


3 
3 


SEQ ID NO: 
SEQ ID NO: 


3835 
3836 


gagagtccaaattagaagt 


8508 


5176ttccatcacaaatcctttg 
5177 actttacttcccaactctc 


13410 


9689 
13429 


3 


SEQ ID NO: 


3837 


agagtccaaattagaagtt 


8509 


8528 SEQ ID NO 


5178 aactttacttcccaactct 


13409 


13428 


3 


SEQ ID NO: 


3838 


tctcaattttgattttcaa 


8527 


8546 SEQ ID NO 


5179 ttgattcccttttttgaga 


11537 


11556 


3 


SEQ ID NO: 


3839 


caattttgattttcaagca 


8530 


8549 SEQ ID NO 


5180 tgctgaatccaaaagattg 


13660 


13679 


3 


SEQ ID NO: 


3840 


aatgcacaactctcaaacc 


8549 


8568 SEQ ID NO 


5181 ggtttatcaaggggccatt 


12460 


12479 


3 


SEQ ID NO: 


3841 


agttctccagcaagtacct 


8604 


8623 SEQ ID NO 


5182aggttccatcgtgcaaact 


11388 


11407 


3 


SEQ ID NO: 


3842 


agtacctgagaacggagca 


8616 


8635 SEQ ID NO 


5183 tgctccaggagaacttact 


13780 


13799 


3 


SEQ ID NO: 


3843 


tcaaacacagtggcaagtt 


8678 


8697 SEQ ID NO 


5184aactctcaagtcaagttga 


13422 


13441 


3 


SEQ ID NO: 


3844 




8751 


8770 SEQ ID NO 


5185tccattctgaatatattgt 


13380 


13399 


3 


SEQ ID NO: 


3845 


ctggatagcaacactaaat 


8765 


8784 SEQ ID NO 


5186 attttctgaacttccccag 


12702 


12721 


3 


SEQ ID NO: 


3846 


ctgacctgcgcaacgagat 


8829 


8848 SEQ ID NO 


5187 atctgatgaggaaactcag 


12259 


12278 


3 


SEQ ID NO: 


3847 


agatgagggaacacatgaa 


8929 


8948 SEQ ID NO 


5188ttcatgtccctagaaatct 


10038 


10057 


3 


SEQ ID NO: 


3848 


tcaacttttctaaacttga 


9060 


9079 SEQ ID NO 


5189 tcaaggataacgtgtttga 


12618 


12637 


3 


SEQ ID NO: 


3849 


ttctaaacttgaaattcaa 


9067 


9086 SEQ ID NO 


5190ttgatgatgctgtcaagaa 


7308 


7327 


3 


SEQ ID NO: 


3850 


gaaattcaatcacaagtcg 


9077 


9096 SEQ ID NO 


5191 cgacgaagaaaataatttc 


13566 


13585 


3 


SEQ ID NO: 


3851 


cactgtttggagaagggaa 


9141 


9160 SEQ ID NO 


5192ttccagaaagcagccagtg 


12506 


12525 


3 


SEQ ID NO: 


3852 


actgtttggagaagggaag 


9142 


9161 SEQ ID NO 


5193 cttccccaaagagaccagt 


2898 


2917 


3 


SEQ ID NO: 


3853 


aattctcttttcttttcag 


9221 


9240 SEQ ID NO 


5194 ctgattactatgaaaaatt 


13638 


13657 


3 


SEQ ID NO: 


3854 


ttcttttcagcccagccat 


9230 


9249 SEQ ID NO 


5195 atggaaaagggaaagagaa 


13494 


13513 


3 


SEQ ID NO: 


3855 


tttgaaagttcgttttcca 


9283 


9302 SEQ ID NO 


5196 tggaagtgtcagtggcaaa 


10380 


10399 


3 


SEQ ID NO: 


3856 




9312 


9331 SEQ ID NO 


5197 aggacctttcaaattcctg 


9848 


9867 


3 


SEQ ID NO: 


3857 


ataagtacaaccaaaattt 


9405 


9424 SEQ ID NO 


5198 aaatcaggatctgagttat 


14038 


14057 


3 


SEQ ID NO: 


3858 


acaacgagaacattatgga 


9435 


9454 SEQ ID NO 


5199tccattctgaatatattgt 


13380 


13399 


3 


SEQ ID NO: 


3859 


aggaataaatggagaagca 


9463 


9482 SEQ ID NO 


5200tgctggaattgtcattcct 


11734 


11753 


3 


SEQ ID NO: 


3860 


agcaaatctggatttctta 


9478 


9497 SEQ ID NO 


5201taagttctctgtacctgct 


11719 


11738 


3 


SEQ ID NO: 


3861 


tcctttaacaattcctgaa 


9502 


9521 SEQ ID NO 




13206 


13225 


3 


SEQ ID NO: 


3862 


tttaacaattcctgaaatg 


9505 


9524 SEQ ID NO 


5203catttgatttaagtgtaaa 


9621 


9640 


3 
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SEQ ID NO: 


3863 


acacaataatcacaactcc 


9534 9553 SEQ ID NO: 


5204 ggagacagcatcttcgtgt 


11211 


11230 


1 3 


SEQ ID NO: 


3864 


aagatttctctctatggga 


9561 9580 SEQ ID NO: 


5205 tcccagaaaacctcttctt 


3936 


3955 


1 3 


SEQ ID NO: 


3865 


gaaaaaacaggcttgaagg 


9578 9597 SEQ ID NO: 


5206 ccttttacaattcattttc 


13021 


13040 


1 3 


SEQ ID NO: 


3866 


ttgaaggaattcttgaaaa 


9590 9609 SEQ ID NO: 


5207 ttttgagaatgaatttcaa 


10422 


10441 


1 3 


SEQ ID NO: 


3867 


tgaaggaattcttgaaaac 


9591 9610 SEQ ID NO: 


5208 gttttggctgataaattca 


11291 


11310 


1 3 


SEQ ID NO: 


3868 


agctcagtataagaaaaac 


9640 9659 SEQ ID NO: 


5209 gtttgataagtacaaagct 


9805 


9824 


1 3 


SEQ ID NO: 


3869 


tcaaatcctttgacaggca 


9720 9739 SEQ ID NO: 


5210 tgcctgagcagaccattga 


11688 


11707 


1 3 


SEQ ID NO: 


3870 


atgaaacaaaaattaagtt 


9789 9808 SEQ ID NO: 


5211 aactttgcactatgttcat 


12762 


12781 


1 3 


SEQ ID NO: 


3871 


aattcctggatacactgtt 


9859 9878 SEQ ID NO: 


5212 aacacatgaatcacaaatt 


8938 


8957 


1 3 


SEQ ID NO: 


3872 


ttccagttgtcaatgttga 


9876 9895 SEQ ID NO: 


5213 tcaaaacgagcttcaggaa 


13207 


13226 


1 3 


SEQ ID NO: 


3873 


aagtgtctccattcaccat 


9894 9913 SEQ ID NO: 


5214 atgggaagtataagaactt 


4842 


4861 


1 3 


SEQ ID NO: 


3874 


gtcagcatgcctagtttct 


9950 9969 SEQ ID NO: 


5215agaaaaggcacaccttgac 


11080 


11099 


1 3 


SEQ ID NO: 


3875 


ctgccatgggcaatattac 


10113 10132 SEQ ID NO: 


5216gtaagaaaatacagagcag 


6440 


6459 


1 3 


SEQ ID NO: 


3876 


tgaataccaatgctgaact 


10167 10186 SEQ ID NO: 


5217agttgaaggagactattca 


7224 


7243 


1 3 


SEQ ID NO: 


3877 


tattgttgctcatctcctt 


10201 10220 SEQ ID NO: 


5218 aaggaaacataaactaata 


12889 


12908 


1 3 


SEQ ID NO: 


3878 


tgttgctcatctcctttct 


10204 10223 SEQ ID NO: 


5219 agaagaaatctgcagaaca 


12431 


12450 


3 


SEQ ID NO: 


3879 


tctgtcattgatgcactgc 


10232 10251 SEQ ID NO 


5220gcagtagactataagcaga 


13928 


13947 


3 


SEQ ID NO: 


3880 


ccacagctctgtctctgag 


10305 10324 SEQ ID NO: 


5221 ctcagggatctgaaggtgg 


8195 


8214 


3 


SEQ ID NO: 


3881 


atttgtggagggtagtcat 


10330 10349 SEQ ID NO: 


5222atgaagtagaccaacaaat 


7161 


7180 


3 


SEQ ID NO: 


3882 


atatggaagtgtcagtggc 


10377 10396 SEQ ID NO: 


5223 gccacactccaacgcatat 


10778 


10797 


3 


SEQ ID NO: 


3883 


tggaaataccaagtcaaaa 


10453 10472 SEQ ID NO: 


5224ttttacaattcattttcca 


13023 


13042 


3 


SEQ ID NO: 


3884 


aagtcaaaacctactgtct 


10463 10482 SEQ ID NO: 


5225agacctagtgattacactt 


12859 


12878 


3 


SEQ ID NO: 


3885 


actgtctcttcctccatgg 


10475 10494 SEQ ID NO: 


5226ccatgcaagtcagcccagt 


10924 


10943 


3 


SEQ ID NO: 


3886 


cttcctccatggaatttaa 


10482 10501 SEQ ID NO: 


5227ttaatcgagaggtatgaag 


7148 


7167 


3 


SEQ ID NO: 


3887 


attcttcaatgctgtactc 


10512 10531 SEQ ID NO: 


5228 gagttgagggtccgggaat 


12242 


12261 


3 


SEQ ID NO: 


3888 


ttgaccacaagcttagctt 


10548 10567 SEQ ID NO: 


5231 aagcgcacctcaatatcaa 


12036 


12055 


3 


SEQ ID NO: 


3889 


cctcacctcttacttttcc 


10573 10592 SEQ ID NO: 


5232ggaactattgctagtgagg 


10649 


10668 


3 


SEQ ID NO: 


3890 


agctgcagggcacttccaa 


10710 10729 SEQ ID NO: 


5233ttgggaagaagaggcagct 


12289 


12308 


3 


SEQ ID NO: 


3891 


ttccaaaattgatgatatc 


10723 10742 SEQ ID NO: 


5234gatatacactagggaggaa 


12745 


12764 


3 


SEQ ID NO: 


3892 


gagaacatacaagcaaagc 


10860 10879 SEQ ID NO: 


5235gcttggttttgccagtctc 


2467 


2486 


3 


SEQ ID NO: 


3893 


atggcaaatgtcagctctt 


10897 10916 SEQ ID NO: 


5236aagaggtatttaaagccat 


12960 


12979 


3 


SEQ ID NO: 


3894 


tggcaaatgtcagctcttg 


10898 10917 SEQ ID NO: 


5237caagaggtatttaaagcca 


12959 


12978 


3 


SEQ ID NO: 


3895 


ttgttcaggtccatgcaag 


10914 10933 SEQ ID NO: 


5238cttgggggaggaggaacaa 


14066 


14085 1 


3 


SEQ ID NO: 


3896 


tgttcaggtccatgcaagt 


10915 10934 SEQ ID NO: 


5239 acttgggggaggaggaaca 


14065 


14084 1 


3 


SEQ ID NO: 


3897 


agttccttccatgatttcc 


10940 10959 SEQ ID NO: 


5240ggaatctgatgaggaaact 


12256 


12275 1 


3 


SEQ ID NO: 


3898 


tgctaacactaagaaccag 


10987 11006 SEQ ID NO: 


5241 ctggatgtaaccaccagca 


11186 


11205 1 


3 


SEQ ID NO: 


3899 


actaagaaccagaagatca 


10994 11013 SEQ ID NO: 


5242tgatcaagaacctgttagt 


13347 


13366 1 


3 


SEQ ID NO: 


3900 


ctaagaaccagaagatcag 


10995 11014 SEQ ID NO: 


5243ctgatcaagaacctgttag 


13346 


13365 1 


3 


SEQ ID NO: 


3901 


cagaagatcagatggaaaa 


11003 11022 SEQ ID NO: 


5244ttttcagaccaactctctg 


13622 


13641 1 


3 


SEQ ID NO: 


3902 


aaaaatgaagtccggattc 


11018 11037 SEQ ID NO: 


5245gaatttgaaagttcgtttt 


9280 


9299 1 


3 


SEQ ID NO: 


3903 


gattcattctgggtctttc 


11032 11051 SEQ ID NO: 


5246gaaaacctatgccttaatc 


13166 


13185 1 


3 


SEQ ID NO: 






11079 11098 SEQ ID NO: 


5247tcaaaacctactgtctctt 


10466 


10485 1 


3 


SEQ ID NO: 


3905 


aaggacacctaaggttcct 


11115 11134 SEQ ID NO: 


5248aggacaccaaaataacctt 


7572 


7591 1 


3 


SEQ ID NO: 


3906 


ccagcattggtaggagaca 


11199 11218 SEQ ID NO: 


5249tgtcaacaagtaccactgg 


12370 


12389 1 


3 


SEQ ID NO: 


3907 


ctttgtgtacaccaaaaac 


11239 11258 SEQ ID NO: 


5250 gtttttaaattgttgaaag 


13148 


13167 1 


3 


SEQ ID NO: 


3908 


ccatccctgtaaaagtttt 


11277 11296 SEQ ID NO: 


5251 aaaagggtcatggaaatgg 


8893 


8912 1 


3 


SEQ ID NO: 


3909 


tgatctaaattcagttctt 


11332 11351 SEQ ID NO: 


5252aagatagtcagtctgatca 


13334 


13353 1 


3 


SEQ ID NO: 


3910 


aagaagctgagaacttcat 


11432 11451 SEQ ID NO: 


5253 atgagatcaacacaatctt 


13110 


13129 1 


3 


SEQ ID NO: 


3911 


tttgccctcaacctaccaa 


11453 11472 SEQ ID NO: 


5254ttggtacgagttactcaaa 


12641 


12660 1 


3 


SEQ ID NO: 


3912 


cttgattcccttttttgag 


11536 11555 SEQ ID NO: 


5255 ctcaattttgattttcaag 


8528 


8547 1 


3 
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SEQ ID NO: 


3913 


ttcacgcttccaaaaagtg 


11591 11610 SEQ ID NO: 5256 cactcattgattttctgaa 








SEQ ID NO: 


3914 


tgtttcagatggcattgct 


11608 11627 SEQ ID NO: 


5257 agcagattatgttgaaaca 


11833 


11852 


1 3 


SEQ ID NO: 


3915 


aatgcagtagccaacaaga 


11639 11658 SEQ ID NO: 


5258 tcttttcagcccagccatt 


9231 


9250 


1 3 


SEQ ID NO: 


3916 


ctgagcagaccattgagat 


11691 11710 SEQ ID NO: 


5259 atctgatgaggaaactcag 


12259 


12278 


1 3 


SEQ ID NO: 


3917 


tgagcagaccattgagatt 


11692 11711 SEQ ID NO: 


5260 aatctgatgaggaaactca 


12258 


12277 


1 3 


SEQ ID NO: 


3918 


ttgagattccctccattaa 


11703 11722 SEQ ID NO: 


5261 ttaatcttcataagttcaa 


13179 


13198 


1 3 


SEQ ID NO: 


3919 


acttggagtgccagtttga 


11807 11826 SEQ ID NO: 


5262tcaattgggagagacaagt 


6504 


6523 


1 3 


SEQ ID NO: 


3920 


caaatttgaaggacttcag 


12004 12023 SEQ ID NO: 


5263 ctgagaacttcatcatttg 


11438 


11457 


3 


SEQ ID NO: 
SEQ ID NO: 


3921 


agcccagcgttcaccgatc 


12056 12075 SEQ ID NO: 


5264gatccaagtatagttggct 


13286 


13305 


3 


SEQ ID NO: 


3922 
3923 




12060 12079 SEQ ID NO: 
12074 12093 SEQ ID NO: 


5266tctgatatacatcacggag 


13960 
13711 


13979 
13730 


3 
3 


SEQ ID NO: 


3924 


atgaggaaactcagatcaa 


12264 12283 SEQ ID NO: 


5267ttgagttgcccaccatcat 


11667 


11686 


3 


SEQ ID NO: 


3925 


aggcagcttctggcttgct 


12300 12319 SEQ ID NO: 


5268 agcaagtctttcctggcct 


3018 


3037 


3 


SEQ ID NO: 


3926 


tgaaagacaacgtgcccaa 


12327 12346 SEQ ID NO: 


5269ttgggagagacaagtttca 


6508 


6527 


3 


SEQ ID NO: 


3927 


tatgattatgtcaacaagt 


12362 12381 SEQ ID NO: 


5270actttgcactatgttcata 


12763 


12782 


3 


SEQ ID NO: 


3928 


cattaggcaaattgatgat 


12475 12494 SEQ ID NO: 


5271 atcaacacaatcttcaatg 


13115 


13134 


3 


SEQ ID NO: 


3929 


ttgactcaggaaggccaag 


12584 12603 SEQ ID NO: 


5272cttggtacgagttactcaa 


12640 


12659 


3 


SEQ ID NO: 


3930 


gaaacctgggatatacact 


12736 12755 SEQ ID NO: 


5273agtgattacacttcctttc 


12865 


12884 


3 


SEQ ID NO: 


3931 


tcctttcgagttaaggaaa 


12877 12896 SEQ ID NO: 


5274tttctgccactgctcagga 


13524 


13543 


3 


SEQ ID NO: 


3932 


gccattcagtctctcaaga 


12974 12993 SEQ ID NO 


5275tcttccgttctgtaatggc 


5802 


5821 


3 


SEQ ID NO: 


3933 


gtgctacgtaatcttcagg 


13001 13020 SEQ ID NO: 


5276cctgcaccaaagctggcac 


13964 


13983 


3 


SEQ ID NO: 


3934 


agctgaaagagatgaaatt 


13065 13084 SEQ ID NO: 


5277aatttattcaaaacgagct 


13200 


13219 


3 


SEQ ID NO: 


3935 


aatttacttatcttattaa 


13080 13099 SEQ ID NO: 


5278 ttaaaagaaatcttcaatt 


13815 


13834 


3 


SEQ ID NO: 


3936 


ttttaaattgttgaaagaa 


13150 13169 SEQ ID NO: 


5279ttctctctatgggaaaaaa 


9566 


9585 


3 


SEQ ID NO: 


3937 


taatcttcataagttcaat 


13180 13199 SEQ ID NO: 


5280attgagattccctccatta 


11702 


11721 


3 


SEQ ID NO: 


3938 


atattttgatccaagtata 


13279 13298 SEQ ID NO: 


5281 tataagcagaagcacatat 


13937 


13956 1 


3 








13311 13330 SEQ ID NO: 


5282tcaaccttaatgattttca 




8314 1 


3 


SEQ ID NO: 


3940 


caatttctgcacagaaata 


13442 13461 SEQ ID NO: 


5283tattcttcttttccaattg 


13834 


13853 1 


3 


SEQ ID NO: 


3941 


agaagattgcagagctttc 


13509 13528 SEQ ID NO: 


5284gaaatcttcaatttattct 


13821 


13840 1 


3 


SEQ ID NO: 


3942 


gaagaaaataatttctgat 


13570 13589 SEQ ID NO: 


5285 atcagttcagataaacttc 


7999 


8018 1 


3 


SEQ ID NO: 


3943 




13680 13699 SEQ ID NO: 


5286ttttgagaatgaatttcaa 


10422 


10441 1 


3 


SEQ ID NO: 


3944 




13693 13712 SEQ ID NO: 


5287aaattccttgacatgttga 


7370 


7389 1 


3 


SEQ ID NO: 


3945 


ttttttaaaagaaatcttc 


13811 13830 SEQ ID NO: 


5288gaagtgtcagtggcaaaaa 


10382 


10401 1 


3 


SEQ ID NO: 


3946 


aggatctgagttattttgc 


14043 14062 SEQ ID NO: 




7864 


7883 1 


3 


SEQ ID NO: 


3947 




14057 14076 SEQ ID NO: 


5290ctccccaggacctttcaaa 


9842 


9861 1 


3 
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Table 10. Selected palindromic sequences from human glucose-6-phosphatase 



SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



5291 tccatcttcaggaagctgt 
5292 ccatcttcaggaagctgtg 



5295ttgaatgtcattttgtggt 
5296 tcagtaatgggggaccac 
5297ttttactgtgcatacatgt 



5301 ttttcctcatcaagttgtt 
5302 ctttcagccacatccacag 
5303 tggactctggagaaagccc 
5304agcctcctcaagaacctgg 
5305 ggcctggggctggctctca 
5306gagctcactcccactggaa 
5307agctaatgaagctattgag 
5308gctaatgaagctattgaga 
5309 ctaaatggctttaattata 

5310 ctgcttttctttttttttc 

5311 caatcaccaccaagcctgg 
5312 agcctggaataactgcaag 
5313 gttccatcttcaggaagct 
5314tggtgggttttggatactg 



5316 gctgttacagaaactttca 
5317 acagcatctataatgccag 
5318gggtgtagacctcctgtgg 



5321 gacctcctgtggactctgg 



5323 ctgggcacgctctttggcc 
5324 ctggtcttctacgtcttgt 
5325agagtgcggtagtgcccct 
5326 tgggcactggtatttggag 
5327 gaattaaatcacggatggc 



5332ttaaaggaaaagtcaacat 
5333acatcttctctcttttttt 
5334 ttctacgtcctcttcccca 
5335tgggtagctgtgattggag 



Start End 
Index Index 

222 241 SEQ ID NO: 

223 242SEQ ID NO: 

417 436SEQ ID NO: 

418 437SEQ ID NO: 
521 540SEQ ID NO: 

1886 1905SEQIDNO: 

1956 1975SEQIDNO: 

50 69 SEQ ID NO: 

51 70 SEQ ID NO: 
487 506SEQ ID NO: 
598 617SEQIDNO: 
651 670SEQ ID NO: 
776 795SEQ ID NO: 
848 867SEQ ID NO: 
878 897 SEQ ID NO: 

1439 1458SEQ ID NO: 

1572 1591 SEQ ID NO: 

1573 1592SEQIDNO: 

1854 1873SEQIDNO: 
2509 2528SEQIDNO: 

0 19SEQ ID NO: 

12 31 SEQ ID NO: 

220 239SEQ ID NO: 

326 345SEQ ID NO: 

392 411 SEQ ID NO: 

638 657SEQ ID NO: 

666 685SEQ ID NO: 

760 779SEQ ID NO: 

761 780SEQ ID NO: 

762 781 SEQ ID NO: 
767 786SEQ ID NO: 

862 881 SEQ ID NO: 

863 882 SEQ ID NO: 
1028 1047SEQIDNO: 
1056 1075SEQIDNO: 
1217 1236SEQIDNO: 
1267 1286SEQIDNO: 
1598 1617SEQIDNO: 
1764 1783SEQIDNO: 

1855 1874SEQIDNO: 
2215 2234SEQIDNO: 
2330 2349SEQIDNO: 
2345 2364SEQIDNO: 

197 216 SEQ ID NO: 

257 276 SEQ ID NO: 



5371 cccattttgaggccagagg 
5372gcccattttgaggccagag 
5373 accatacattatcattcaa 



5375acatctttgaaac 
5376tcatgtctcagcctcctca 
5377 ctcatgtctcagcctcctc 
5378 ggtcgcctggcttattccc 



Start End 
Index Index 
1340 1359 
1339 1358 
1492 1511 
1491 1510 
2945 2964 
2731 2750 



5380ctgtggactctggagaaag 
5381 gggctggctctcaactcca 
5382ccagattcttccactggct 
5383tgagccaccgcaccgggcc 
5384ttccaggtagggccagctc 
5385 ctcagcctcctcagtagct 
5386 tctcagcctcctcagtagc 
5387tatatttttagaattttag 
5388 gaaaaatatatatgtgcag 




5404 ctcccactggaacagccca 
5405 gccaaccaagagcacattc 



5407 tatcacattacatcatcct 
5408 atatatgtgcagtatttta 
5409gccctccttgcctgttttt 
5410 atgtgcagtattttattaa 
5411 aaaagaaaaatatatatgt 
5412 tgggccagccgcacaagaa 



5397 tccacattgacaccacacc 
5398 gtccacattgacaccacac 
5399 ccagatattgcactaggtc 
5400 gccagctcacaagcccagg 
5401 ggccagctcacaagcccag 



2620 2639 1 5 

2619 2638 1 5 

1295 1314 1 5 

2982 3001 1 5 

773 792 1 5 

884 903 1 5 

2107 2126 1 5 

2801 2820 1 5 

1676 1695 1 5 

2626 2645 1 5 

2625 2644 1 5 

2683 2702 1 5 

2996 3015 1 5 

812 831 1 4 

1987 2006 1 4 

1440 1459 1 4 

2425 2444 1 4 

782 801 1 4 

1474 1493 1 4 

758 777 1 4 

823 842 1 4 

822 841 1 4 

821 840 1 4 

2014 2033 1 4 

1687 1706 1 4 

1686 1705 1 4 

1663 1682 1 4 

2229 2248 1 4 

1446 1465 1 4 

2311 2330 1 4 

2967 2986 1 4 

2063 2082 1 4 

3003 3022 1 4 

2817 2836 1 4 

3007 3026 1 4 

2992 3011 1 4 

1116 1135 1 3 

1446 1465 1 3 
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SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



5337cacttccgtgcccctgata 



5341 tgtgcagctgaatgtctgt 
5342 atgtctgtctgtcacgaat 
5343 ctgtcacgaatctaccttg 
5344 atcaagttgttgctggagt 
5345cagaaactttcagccacat 
5346 actttcagccacatccaca 
5347atgccagcctcaagaaata 
5348agaaatattttctcattac 
5349gaaatattttctcattacc 



5351 cctgtggactctggagaaa 



5353ttgaaacccccatcccaag 
5354cagatggaggtgccatatc 



5359 aaagaaggctgcctaagga 

5360 aagaaggctgcctaaggag 

5361 agaaggctgcctaaggagg 
5362 atttccttggatttctgaa 
5363tccttataagcccagctct 
5364ataagcccagctctgcttt 
5365ggccaggattcctctctca 
5366 gccaactcctccttgcctg 
5367 ttttttttctttttttgag 



772 
784 
1004 
1351 
1438 
1553 
1606 
1785 
1786 
1787 
1788 
1982 
2081 
2086 
2231 
2493 
2519 



282SEQ ID NO: 
377 SEQ ID NO: 
483SEQ ID NO: 
487SEQ ID NO: 
511 SEQ ID NO: 
572SEQ ID NO: 
583 SEQ ID NO: 
591 SEQ ID NO: 
625SEQ ID NO: 
664SEQ ID NO: 
669 SEQ ID NO: 
697SEQ ID NO: 
709SEQ ID NO: 
710 SEQ ID NO: 
763 SEQ ID NO: 
791 SEQ ID NO: 
803 SEQ ID NO: 
1023SEQID NO: 
1370 SEQ ID NO: 
1457SEQID NO: 
1572 SEQ ID NO: 
1625SEQ ID NO: 
1804SEQ ID NO: 
1805SEQ ID NO: 
1806SEQ ID NO: 
1807 SEQ ID NO: 
2001 SEQ ID NO: 
2100 SEQ ID NO: 
2105SEQ ID NO: 
2250 SEQ ID NO: 
2512SEQ ID NO: 
2538SEQ ID NO: 
2671 SEQ ID NO: 



5414 gccatgccatgggcacagc 



5417 tgaatactctcacaagtag 
5418ctgtttttcaatctcatct 



5420 attcaggtatagctgacat 

5421 caaggtgctaggattacag 

5422 actcctgacctcaagtgat 
5423 atgtttcaattaggctctg 
5424tgtggcgtatcatgcaagt 
5425tattttttttactgtgcat 



5427 ggtaaatatgactcctttc 



5429tttcatcatgttggccagg 

5430 ccaccgcaccgggccctcc 

5431 cttgaattcctgggctcaa 

5432gatatgcagagtatttctg 

5433tccacctgccttggcctcc 

5434tttctctatcccaagccaa 

5435 tccaccccactggatcttc 

5436 ccttgcctgcttttctttt 

5437tccttgcctgcttttcttt 

5438ctccttgcctgcttttctt 

5439cctccttgcctgcttttct 

5440ttcaattaggctctgaaat 

5441 agagcacattcttaaagga 

5442 aaagctgaagcctatttat 



5444caggctggagtggagtggc 
5445ctcataacatctttgaaaa 



423 442 
2548 2567 
2705 2724 
1419 1438 
2828 2847 

644 663 
2038 2057 
2779 2798 
2742 2761 
2185 2204 
1818 1837 
1950 1969 
2283 2302 
2282 2301 
2306 2325 
2713 2732 
2805 2824 
2405 2424 
2847 2866 
2760 2779 
2297 2316 
2131 2150 
2503 2522 
2502 2521 
2501 2520 
2500 2519 
2189 2208 
2319 2338 
2889 2908 
2801 2820 
2555 2574 
2977 2996 
2798 2817 
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Table 11. Selected palindromic sequences from rat glucose-6-phosphatase 





Source 


Start End 
Index Index 


Match 


Start End # 
Index Index 


B 


SEQ ID NO 


5447 ctgactattacagcaacag 


301 


320SEQ ID NO 


5471 ct t ct aaactttcaq 






6 


SEQ ID NO 


5448 ctcttggggttggggctgg 


831 


850SEQ ID NO 


5472 ccagcatgtaccgcaagag 






6 


SEQ ID NO 


5449tgcaaaggagaactgcgca 


879 


898SEQ ID NO 


5473tgcgaccgtcccctttgca 


1019 


1038 1 




SEQ ID NO 


5450 cctcgggccatgccatggg 


376 


395SEQ ID NO 


5474 cccagtgtggggccagagg 


1171 


1190 1 


g 


SEQ ID NO 


5451 ttgagcaaaccatatgcaa 


1478 


1497 SEQ ID NO 


5475ttgcagagtgtgtcttcaa 




2076 1 


5 


SEQ ID NO 


5452 cagcttcctgaggtaccaa 


2 


21 SEQ ID NO 


5476 ttggtgtctgtgatcgctg 


123 


142 1 


4 


SEQ ID NO 


5453 ggtaccaaggaggaaggat 


13 


32 SEQ ID NO 


5477 atccagtcgactcgctacc 


66 


85 1 




SEQ ID NO 


5454 ctccacgactttgggatcc 


51 


70 SEQ ID NO 


5478 ggatcgggaggagggggag 






4 


SEQ ID NO 


5455 caggactggtttgtcttgg 


108 


127SEQ ID NO 


5479 ccaagcccgactgtgcctg 




2037 1 


4 


SEQ ID NO 


5456 cttctatgtcctctttccc 


155 


174SEQ ID NO 


5480 gggacagacacacaagaai 


1076 


1095 1 


* 


SEQ ID NO 


5457 ttctatgtcctctttccca 


156 


175 SEQ ID NO 


5481 tgggacagacacacaagaa 


1075 


1094 1 




SEQ ID NO 


5458tggttccacattcaagaga 


177 


196SEQ ID NO 


5482 tctcaataatgatagacca 


1549 


1568 1 


4 


SEQ ID NO 


5459 tgcctctgataaaacagtt 


325 


344SEQ ID NO 


5483 aactctgagatcttgggca 


1868 


1887 1 


4 


SEQ ID NO 


5460agcccggctcctgggacag 


1064 


1083SEQ ID NO 


5484 ctgtcctccagcctgggct 


2034 


2053 1 


4 


SEQ ID NO 


5461 agtctctgacacaagtcag 


1111 


1130 SEQ ID NO 


5485 ctgaatggtaatggtgact 


1659 


1678 1 


4 


SEQ ID NO 


5462 aaaaaggtgaatttttaaa 


1237 


1256SEQ ID NO 


5486 ittattaaaacgacatttt 


2201 


2220 1 


4 


SEQ ID NO 


5463 acactctcaataatgatag 


1545 


1564SEQIDNO 


5487 ctatgaatgatgcctgtgt 


2121 


2140 1 


4 


SEQ ID NO 


5464 aaagaatgaacgtgctcca 


37 


56SEQ ID NO 


5488 tggacctcctgtggacttt 


724 


743 1 


3 


SEQ ID NO 


5465 ctttgggatccagtcgact 


59 


78SEQ ID NO 


5489 agtcagcggccgtgcaaag 


1124 


1143 1 


3 


SEQ ID NO 


5466 gtgatcgctgacctcagga 


132 


151 SEQ ID NO 


5490 tcctctctccaaaggtcac 


1911 


1930 1 


3 


SEQ ID NO 


5467 ggaacgccttctatgtcct 


148 


167SEQ ID NO 


5491 aggactcatcactgcttcc 


1748 


1767 1 


3 


SEQ ID NO 


5468gactgtgggcatcaatetc 


194 


213SEQ ID NO 


5492 gagactggaccagggagtc 


357 


376 1 


3 


SEQ ID NO 


5469 ggacactgactattacagc 


296 


315SEQ ID NO 




518 


537 1 


3 


SEQ ID NO 


5470aagcccccgtcccagattg 


966 


985SEQ ID NO: 5494caattgtttgctggtgctt 


1833 


1852 1 


3 
Table 12. Selected palindromic sequences from human β-catenin



5495 agcagcttcagtccccgcc 
5496 ccattctggtgccactacc
5497 tccttctctgagtggtaaa



5501 taaatgacgaggaccaggt 



5504 tcccctgagggtatttgaa



Start End 
Index Index 

70 89 SEQ ID NO:
304 323 SEQ ID NO:
328 347 SEQ ID NO:
334 353 SEQ ID NO:
473 492 SEQ ID NO:

677 696 SEQ ID NO:

678 697 SEQ ID NO:
383 402 SEQ ID NO:

1839 1858 SEQ ID NO:
143 162 SEQ ID NO:
151 170 SEQ ID NO:
260 279 SEQ ID NO:

383 402 SEQ ID NO:

384 403 SEQ ID NO:
454 473 SEQ ID NO:
563 582 SEQ ID NO:

623 642 SEQ ID NO:
718 737 SEQ ID NO:
915 934 SEQ ID NO:
1291 1310 SEQ ID NO:
1356 1375 SEQ ID NO:
5516 tgtccttcgggctggtgac 1549 1568 SEQ ID NO:
5517 cacagctcctctgacagag 2107 2126 SEQ ID NO:
5518 ccagacagaaaagcggctg 245 264 SEQ ID NO:
5519 cagcagcgttggcccggcc 4 23 SEQ ID NO:
5520 aggtctgaggagcagcttc 60 79 SEQ ID NO:
5521 actgttttgaaaatccagc 174 193 SEQ ID NO:
5522 ctgatttgatggagttgga 213 232 SEQ ID NO:
5523 ccagacagaaaagcggctg 245 264 SEQ ID NO:
5524 acagctccttctctgagtg 323 342 SEQ ID NO:
5525 tggatacctcccaagtcct
5526 tcaagaacaagtagctgat

5527 agctcagagggtacgagct

5528 gcatgcagatcccatctac
5529 ccacacgtgcaatccctga
5530 cacacgtgcaatccctgaa
5531 ggaccttgcataacctttc



5506 gctgttagtcactggcagc
5507 gtcctgtatgagtgggaac



5510 gtccagcgtttggctgaac

5511 tatcaagatgatgcagaac
5512 tatggtccatcagctttct
5513 ccctggtgaaaatgcttgg
5514 agctttaggacttcacctg



5532 ctccacaaccttttattac



5534 ggactctcaggaatctttc
5535 tgatataaatgtggtcacc
5536 cccagcgccgtacgtGcat



5538 ttgtaccggagcccttcac



369 388 SEQ ID NO:
424 443 SEQ ID NO:
469 488 SEQ ID NO:
516 535 SEQ ID NO:

645 664 SEQ ID NO:

646 665 SEQ ID NO:
846 865 SEQ ID NO:
974 993 SEQ ID NO:

1222 1241 SEQ ID NO:
1347 1366 SEQ ID NO:
1435 1454 SEQ ID NO:
1839 1858 SEQ ID NO:
1852 1871 SEQ ID NO:
1915 1934 SEQ ID NO:
1962 1981 SEQ ID NO:



5542 ggcgacatatgcagctgct
5543 ggtatggaccccatgatgg



5545 attgtacgtaccatgcaga
5546 tagctgcaggggtcctctg
5547 cctgtaaatcatcctttag
5548 acctgtaaatcatccttta
5549 gttccgaatgtctgaggac



5551 1

5552 tggttaagctcttacaccc
5553 gctgcctccaggtgacagc



Start End # B 
Index Index 

2152 2171 1 5 

2387 2406 1 5 

985 1004 1 5 

791 810 1 5 

2037 2056 1 5 

2539 2558 1 5 

2538 2557 1 5 

2176 2195 2 4 

2451 2470 2 4 

1929 1948 1 4 

1680 1699 1 4 

2494 2513 1 4 

1652 1671 1 4 

2175 2194 1 4 

2517 2536 1 4 

1652 1671 1 4 

1820 1839 1 4 

1126 1145 1 4 

2029 2048 1 4 

2502 2521 1 4 

2162 2181 1 4

1605 1624 1 4

5564 ctctaggaatgaaggtgtg 2134 2153 1 4

5565 cagctcgttgtaccgctgg 828 847 2 3

5566 ggccaccaccctggtgctg 2420 2439 1 3

5567 gaagaggatgtggatacct 359 378 1 3

5568 gctgatattgatggacagt 437 456 1 3

5569 tccaggtgacagcaatcag 2500 2519 1 3



5556 agctggcctggtttgatac
5557 gttcgccttcactatggac



5561 caggtgacagcaatcagct
5562 gcagctgctgttttgttcc



5570 cagcaacagtcttacctgg
5571 cactgagcctgccatctgt
5572 aggactaaataccattcca
5573 atcagctggcctggtttga
5574 agctggtggaatgcaagct
5575 gtagaagctggtggaatgc
5576 tcagatgatataaatgtgg
5577 ttcagatgatataaatgtg
5578 gaaatcttgccctttgtcc
5579 gtaaatcatcctttaggag
5580 tagctgcaggggtcctctg
5581 gaaatcttgccctttgtcc



5583 atggccaggatgccttggg



275 294 1 3

1579 1598 1 3

1972 1991 1 3

2514 2533 1 3

1276 1295 1 3

1271 1290 1 3 

1430 1449 1 3 

1429 1448 1 3 

1743 1762 1 3 

2542 2561 1 3 

2037 2056 1 3 

1743 1762 1 3 

1562 1581 1 3 

2370 2389 1 3 

2053 2072 1 3 

2055 2074 1 3 

2553 2572 1 3 
SEQ ID NO: 5540 gaagctattgaagctgagg 2084 2103 SEQ ID NO: 5587 cctctgacagagttacttc 2114 2133 1
SEQ ID NO: 5541 tcagaacagagccaatggc 2247 2266 SEQ ID NO: 5588 gccaccaccctggtgctga 2421 2440 1
Table 13. Selected palindromic sequences from human hepatitis C virus (HCV)



5589 cagcacctgggtgctggta 



5596 ctcaccacccagaacaccc 



5597 ccagccttaccatcaccca 



5603 ggtggtcagatcgttggtg 



ccttggcccctctatggca 



5606 gggcacgctgcccgcctca 



5607 ctgcaatgactccctccag



5612 cgtccgttgccggagcgca



5613 gtctggcattattgaccti



5614 tctttgatatcaccaaact



5615 cttatgattgccatactcg



5617 gggacatcatcctgggcct



5618 gggcgtcttccgggccgct



5619 ggcgtcttccgggccgctg 



5620 gcgtcttccgggccgctgt 



5333 SEQ ID NO:
1701 SEQ ID NO:
SEQ ID NO:
1371 SEQ ID NO:
2067 SEQ ID NO:
2068 SEQ ID NO:
5575 SEQ ID NO:
5763 SEQ ID NO:
ID NO: 



5621 gtccccggtcttcacagac 



5622 catcaggactggggtaagg 



5624 ggggggaaggcacctcatt 



5625 ccgagcaattcaagcagaa



5626 agatgaaggcaaaggcgtc



5627 cccctagggggcgctgcca



5628 ctcccggcctagttggggc



5629 ttccgctcgtcggcggccc
5630 cccctagggggcgctgcca
5631 gccccgccggcatgcgaca



6269 SEQ ID NO:
8235 SEQ ID NO:
1449 SEQ ID NO:
SEQ ID NO:
SEQ ID NO:
603 SEQ ID NO:
1284 SEQ ID NO:



1508 1527 SEQ ID NO:



1624 1643 SEQ ID NO:



1897 1916 SEQ ID NO:



2032 2051 SEQ ID NO:
2238 2257 SEQ ID NO:
2307 SEQ ID NO:
2632 SEQ ID NO:
SEQ ID NO:



3033 SEQ ID NO:
3333 SEQ ID NO:
3343 SEQ ID NO:
3893 SEQ ID NO:



3895 SEQ ID NO:
3980 SEQ ID NO:



4520 SEQ ID NO:



7840 SEQ ID NO:



769 SEQ ID NO:



1222 1241 SEQ ID NO:





SEQ ID NO  Match                Start Index  End Index  #  B
6135       accatcacccagctgctg   6196         6215       1  9
6136       xgggcagcgggtcgagtt   8202         8221       1  8
6137       gagagcgacgccgcagcg   6151         6170       1  7
6138       ?ggcatgtgggcccgggag  6053         6072       1  7
6139       ?gacccctcccacattaca  6871         6890       1  7
6140       xgacccctcccacattac   6870         6889       1  7
6141       ccggctggttcgttgctg   9254         9273       1  7
6142       gggtgtgcacggtgttgag  6291         6310       1  7
6143       gggcgctggtatcgctgg   5832         5851       1  7
6144       gagcccgaaccggacgtag  6830         6849       1  7
6145       cgagcccgaaccggacgta  6829         6848       1  7
6146       aggctatgactaggtactc  8634         8653       1  7
6147       agcgcattttcactccat   9019         9038       -  -
6148       gttgccgctaccttaggtt  4115         4134       1  6
6149       caccagcccgctcaccacc  5734         5753       1  6
6150       tgccaacgtgggtacaagg  6374         6393       1  6
6151       ctgacgactagctgcggta  8465         8484       1  6
6152       tgagacgacgaccgtgccc  4759         4778       1  6


; 6153 


ctggtggccctcaatgcag 








— | 




gttgccgctaccttaggtt 












acaccacgggcccctgcac 


~6537 







— g 

— - 
— - 


: 6156 


tgctcaatgtcctacacat 


7610 






): 6157 


ccaagctcaaactcactcc 


"Hi 










tgcgagcccgaaccggacg 






1 


6 


): 6159 


aag gtcacctttgacagac 








— ^ 


): 6160 


agttcgatgaaatggaaga 








— - 


): 6161 


cgagcaattcaagcagaag 


~551f 








): 6162 


tgatcacgccatgcgccgc 


764' 


7660 


1 


6 


): 6163 


aggcggtggattttgtccc 


3915 


3934 


1 


6 


•v 6164 


agcggcacggcgaccgccc 


7439 


7458 


1 


6 




cagcggcacggcgaccgcc 


~7^i 




-I 


6 




acaggtgccctgatcacgc 




765C 


1 


6 


SEQ ID NO  Match                Start Index  End Index  #  B
6167       gtcttggaagaacccggac  7252         7271       1  6
6168       ccttcctcaagccgtgatg  8155         8174       1  6
6169       ccgggggaacggccctcgg  4853         4872       1  6
6170       aatgttgtgacttggcccc  8334         8353       1  6
6171       ttctgattgccatactcgg  3015         3034       1  6
6172       gacgaccttgtcgttatct  856?         858?       1  6
6173       tggccggcgccccccgggg  367?         369?       -  5
6174       gcccccccttgagggggag  7519         7538       2  5
6175       gggcaaaggacgtccggaa  7923         7942       2  5
6176       tggcgggggcccactgggg  1383         1402       2  5
6177       tgtcccagggggggagggc  914?         916?       2  5
5632 aggacgaccgggtcctttc



5634 aaaaccaaacgtaacacca 



5635 caaccgccgcccacaggai 



5640 gttggggccccacggaccc 



5641 ttggggccccacggacccc 



5642 tggggccccacggaccccc 



5643 cctcacatgcggcctcgcc 



5644 cacatgcggcctcgccgac 



5645 tccgctcgtcggcggcccc 



5646 ggcgctgccagggccttgg 



ccatgtcacgaacgactgc 



5649 tgccctgcgttcgggaggg 



5650 gccctgcgttcgggagggt 

5651 aggaatgctaccatcccca 



5653 atacgacaccacgtcgatt 



5654 atttgctcgttggggcggc 



5675 
5676 
5677 



5655 ccttctcgccccgccggca 



tcgtccggatgcccggagc 



5663 gacaaccgatcgtctcggc 



aggccacgtactcaaaatg 



tgtatgtggggggcgtgga 



5669 gagtggcaggttctgccct 



toctttgcaatcaaatggg 



5670 
5671 
5672 

5673 gcggcatatgctttctatg 



agcccaggccgaggccgcc 



ggcggcatatgctttctat 



5676 cccccctcaacgtccgggg 



444 463 SEQ 



450 469 SEQ 



460 479 SEQ 



657 676 SEQ 



658 677 SEQ 



659 678 SEQ 



715 734 SEQ 



1128 1147SEQ 



j^SEQ 



1040 SEQ 



1215 1234 SEQ 



1266 1285 SEQ 



ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 



ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
ID NO: 
3574 1 3593[ SEQ ID NO: 
3575 3594 SEQ ID NO: 



3420 S EQ |[ 



SEQ ID NO  Match                Start Index  End Index  #  B
6178       gaaaaaggacggttgtcct  7341         7360       1  5
-          agaaaaaggacggttgtcc  7340         7359       -  5
-          tggtttttttttttttttt  9443         9462       1  5
-          gtcctgaacccgtctgttg  4100         4119       1  5
6182       -                    4754         4773       1  5
6183       ccccggccacg??        1267         1286       1  5
6184       ctgggcgcgctgacgggca  3164         3183       1  5
6185       -                    9296         9315       1  5
6186       gggtgggtagccgcccaac  5783         5802       1  5
6187       ggggtgggtagccgcccaa  5782         5801       1  5
6188       gggggtgggtagccgccca  5781         5800       1  5
6189       ggcggggcgacaatagagg  3774         3793       1  5
6190       gtcgtcggagtcgtgtgtg  6020         6039       1  5
6191       ggggcaaaggacgtccgga  7922         7941       1  5
6192       ccaagccacagtgtgcgcc  5110         5129       1  5
6193       gcagcaacacgtggcatgg  6498         6517       1  5
6194       cctcacaacgggggggcac  1495         1514       1  5
-          ccctcacaacgggggggca  1494         1513       1  5
-          accctcacaacgggggggc  1493         1512       1  5
6197       tgggcatcggcacagtcct  4323         4342       1  5
6198       cgtattcccagatttggga  8092         8111       1  5
6199       aatcaatgctgtagcgtat  4576         4595       -  5
6200       gccgccacttgcggcaaat  9164         9183       1  5
6201       tgccaacgtgggtacaagg  -            -          -  5
6202       cctgccgcggttaccgggt  6340         6359       -  -
6203       actgcgtcggcatgtgggc  6046         -          -  -
6204       ctggtatcgctggtgcggc  5838         5857       -  -
6205       gggacagatcggagctcag  -            2332       -  5
6206       gcggcgagcctacgagtct  8609         8628       -  5
6207       gctccgggggcgcttacga  4257         4276       1  5
6208       gataacttcccctacctgg  5084         5103       1  5
6209       gccgcggttaccgggtgtc  6343         6362       -  5
6210       ggggtctcccccctccttg  6919         6938       1  5
6211       acgggcgcccccattacgt  4202         4221       1  5
-          ggccgctgtatgcacccgg  3886         3905       1  5
-          -                    3137         3156       1  5
-          cca?gtggcccatctaca   4011         4030       1  5
6215       agggcaggggtggcgactc  3400         3419       1  5
6216       cccaccttatgggcaagga  8861         8880       1  5
6217       ggcgtccacagtcaaggct  7834         7853       1  5
6218       atagaagaagcctgccgcc  7865         7884       1  5
6219       catagaagaagcctgccgc  7864         7883       1  5
6220       ccatagaagaagcctgccg  7863         7882       1  5
6221       ggggggacggcatcatgca  6402         6421       1  5
6222       CGCcaatcgatgaacgggg  9376         9395       1  5
6223       ggaggccgcaagccagccc  8066         8085       1  5
6224       atggtaccgaccctaacat  4158         4177       1  5
6225       catggtaccgaccctaaca  4157         4176       1  5
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO; 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



5682 caccatgcacctgtggcag 



5689 gtgggtaggatcatcttgt 



5697 ggcgccaaactattccaag 



5701 gcggcgatacccatatggg 



5703 cccccccttgagggggagc 



5704 ctgctgctcaatgtcctac 



5705 catggacaggtgccctgat 



5707 ggctatgactaggtactcc 



5708 cac catagatcactcccct 



5709 agctgttcaccttctcgcc 



5711 _ 
5712tc 



571 3 gggacatcatcctgggcct 



5714 gggagatactcctggggcc 



57 1 6 ccagccttaccatcaccca 



571 7 gccctccttgagggcgaca 



571 8 ccagcccccgattgggggc 



571 9 accatagatcactcccctg 



3714|sEQ ID NO: 6226tgcacgatgctcgtgaacg 



3723 SEQ ID NO: 6227 tgccgcggttaccgggtgt 



SEQ ID NO: 6228 otgccgcggttaccgggtg 



4344 SEQ ID NO: 6229ccaggattgcccgtttgcc 



4366 SEQ ID NO: 6230 gctccccccagcgctgctt 
4380 SEQ ID NO: 6231 



gcacggcgaccgcccctcc 



4508 SEQ ID NO: 6232tccccccagcgctgcttcg 



5184 SEQ ID NO: 6233 gccggattacaatcctcca 



SEQ ID NO: 6234 actcgcgatcccaccaccc 



5409SEQ ID NO: 6235 acaacatggtctacgccac 
5534SEQID NO: 6236 



561 1 SEQ ID NO: 6237gctcctcatacggattcca 



5622 SEQ ID NO: 6238aggtgccctgatcacgcca 



5758 SEQ ID NO: 6239ttctggcgggctatggggc 



6325 SEQ ID NO: 6240 caggctataaaatcgctca 



6475 SEQ ID NO: 6241 atggtaccgaccctaacat 



6507 SEQ ID NO: 6242lgttcctccaatgtgtcgg 



6584 SEQ ID NO: 6243 cttgaaagcctctgccgcc 



6986SEQ ID NO: 6244tgtctcctacttgaagggc 



SEQ ID NO: 6245 



ctgcacgccttccccggcg 



71 57 SEQ ID MO: 6246 ttcatgctgtgcctactcc 
7221 SEQ ID NO:_ 



7320 SEQ ID NO: _6248 gggccgccacttgcggcaa 



7539SEQ ID NO: 6249 gctcccggcctagttgggg 



ctccggtggtacacgggtg 



7629 s EQ ID NO: 6250gtaggactggcaggggcag 
7645SEQ1D NO: 6251 



7646SEQID NO: 6252 



8654 SEQ ID NO: 6253ggagcaacttgaaaaagcc 



46 SEQ ID NO: 6254agggccttggcacatggtg 



1225SEQ ID NO: 6255 bgcgtgctgacgactagct 



1624 1643 SEQ ID NO: 6256'ctggtgcggctgttggcag 



5721 gtgcagcctccaggacccc 



5722 tgcagcctccaggaccccc 



5723 ccaggaccccccctcccgg 



5724 accccccctcccgggagag 



5725 ccccctcccgggagagcca 



5727 |agccgagtagtgttgggtc 



2257 SEQ ID NO: 6257 tgctgcgccatcacaacat 



3341 SEQ ID NO: 6258gcccaactcgcteccccca 



3343 SEQ ID NO: 6259aggcaggagataacltccc 



3385 SEQ ID NO: 6260 ggcccctgcacgccttccc 



3593 SEQ ID NO: 6261 atggtctacgccacgacat 



6208SEQ ID NO: 6262tgggtacaagggagtctgg 



6986SEQ ID NO: 6263 tgtcccagggggggagggc 



114SEQ ID NO: 6266gaggccgcgatgccatcat 



132 SEQ ID NO: 6269ccggctggttogttgctgg 



137 SEQ ID NO : _6270 ctctcatgccaacgtgggt 



243 262 SEQ ID NO: 6272 actatgcggtccccggtct 



270 SEQ ID NO: 6273|gaccaggatctcgtcggct 



SEQ ID NO: 6265cagggccttggcacatggt 



123 SEQ ID NO: 6267 gggggacggcatcatgcac 



124 SEQ ID NO: 6268 ggggggacggcatcatgca 



140SEQ ID NO: 6271 tggcaatgagggcatgggg 
5728 ggtgcttgcgagtgccccg

5729 gcgagtgccccgggaggtc 
5730 accgtgcaccatgagcacg 
5731 



5732 gccgcgcaggggccccagg 



5733 accccgtggaaggcgacag 



5734 ccccgtggaaggcgacagc 



5736 ctatccccaaggctcgccg 



5737 tatccccaaggctcgccgg 



5738 cgggtatccttggcccctc 



5740 tcctgtcaccccgcggctc 
5741 gggccccacggacccccgg 



5743 cggcctcgccgacctcatg 



5744 ggcctcgccgacctcatgg 



5745 ggccccctagggggcgctg 



5746 tggcacatggtgtccgggt 



5747 cttcctcttggctctgctg 



5749 gaggcggcggacttgatca 



5750 catccccactacgacaata 

5751 gctgttcaccttotcgccc 



5752 gccccgccggcatgcgaca 



5753 tggcctgggacatgatgat 



5754 cacaagccgtcatcgacat 



5756 ggtggcgggggcccactgg 



5757 gggggcccactggggagtc 



5758 atggcggggaactgggcta 



5759ttgattgtgatgctacttt 



5761 acgctgcccgcctcaccag 



5768 tgtggtccagtgtattgct 



5770 ctgttgtcgtggggacaac 
5771 g 
5772 g 



gccgccgcaaggcaactgg 



5774 |ccccgtgtaacateggggg [ 2043 



318SEQ ID NO: 
325SEQID NO: 



531 SEQ |[ 
_547SEQ IE 
550 SEQ ID NO: 



68JSEQ ID NO: 



783 SEQ IE 



811 



1900 



887 SEQ ID NO: 
963 SEQ ID NO: 

1002 SEQ l[ 

111q SEQ ID NO:_ 

1226 SEQ ID NO: 

1241 

1312 SEQ ID NO 
1381 SEQ IE 
1385 SEQ ID NO: 
140Q SEQ IE 
1406 SEQ II 
1449 SEQ It 
1473 SEQ It 
1519 SEQ ID NO:_ 
1531 SEQ ID NO: 



1583 SEQ IE 
1605SEQ IE 
1606SEQ ID NO: 
1768 SEQ IDNO:" 
1783 SEQ ID NO: 
1863| SEQ ID NO:] 
ID NO:_ 
ID NO: 



1991 SEQ 



2002 SEQ ID NO: 
2062 SEQ ID NO: 



SEQ ID NO  Match                 Start Index  End Index  #  B
6274       cggggccttggttgacacc   2135         2154       -  4
6275       gacccccggcgtaggtcgc   67?          69?        -  4
6276       cgtgcaatacctgtacggt   243?         245?       -  4
6277       gatcatgcatactcccggg   99?          101?       -  4
6278       cctgcacgccttccccggc   654?         656?       -  4
6279       ctgtatgcacccggggggt   389?         391?       -  4
6280       gctgtatgcacccgggggg   389?         390?       -  4
6281       cgagggcagggcctgggct   553          572        1  4
6282       cggctgtcgttcccgatag   541?         543?       -  4
6283       ccggctgtcgttcccgata   5417         5436       1  4
6284       gaggccgcaagccagcccg   8067         8086       1  4
6285       catcgataccctcacatgc   706          725        1  4
6286       gagctgcaaagctccagga   8523         8542       1  4
6287       ccggccgcatatgcggccc   4064         4083       1  4
6288       gccggccgcatatgcggcc   4063         4082       1  4
6289       catgaggatcatcgggccg   6472         6491       1  4
6290       ccatgaggatcatcgggcc   6471         6490       1  4
6291       cagctccgaattgtcggcc   7414         7433       1  4
6292       acccacgctgcacgggcca   5188         5207       1  4
6293       cagcataggtcttgggaag   5863         5882       1  4
6294       agcagtgctcacttccatg   6847         6866       1  4
6295       ?gatggcattcacagcctc   5712         5731       1  4
6296       tattaccggggtcttgatg   4592         4611       1  4
6297       gggctgcgtgggaaacagc   8793         8812       1  4
6298       tgtctcctacttgaagggc   3814         3833       1  4
6299       atcaatttgctccctgcca   5981         6000       1  4
6300       atgtttgggactgggtgtg   6279         6298       1  4
6301       caccaagcaggcggaggct   5560         5579       1  4
6302       ccagggctcaggccccacc   5127         5146       1  4
6303       gactaggtactccgccccc   8641         8660       1  4
6304       agcagtgctcacttccat    6846         6865       1  4
6305       aaagcaagctgcccatcaa   7665         7684       1  4
6306       gcagaaggcgctcgggttg   5530         5549       1  4
6307       ctggacccgaggagagcgt   2278         2297       1  4
6308       atatcgggggtcccctga    8393         8412       1  4
6309       gtggctcggggccttggt    2132         2151       1  4
6310       atgtggctcggggccttgg   2131         2150       1  4
6311       caggactggggtaaggac    4176         4195       1  4
6312       gggtggcttcatgcctcag   9063         9082       1  4
6313       acactccagttaactcctg   8817         8836       1  4
6314       agcagggccatcaaccaca   7949         7968       1  4
6315       acagcagaggcggctaagc   6887         6906       1  4
6316       gttgcaacttggacgacag   2295         2314       1  4
6317       ?cagttggacttatccggc   9241         9260       1  4
6318       acacgggtgcccattgcc    7287         7306       1  4
6319       agtacacgggtgcccattgc  7286         7305       1  4
6320       ccGcaatcgatgaacgggg   9376         9395       1  4
5775 ggactgcttccggaagcac

5776 gactgcttccggaagcacc 

5777 tccggaagcaccccgaggc 

5778 actcaaaatgtggctcggg 

5779 ggccttggttgacacctag 



5782 eagatcggagctcagcccg 



5783 ggagctcagcccgctgctg 



5784 caccctaccggctctgtcc 



5785 cggctctgtccactggctt 



5786 ccatcagaacatcgtggac 



5787 ggtcagcggttgtctcctt 



gctgcatcgtgcggaggcg 



5791 
5792 
5793 

5794 cgccatattacaaggtgtt 



attattgaccttgtcgcca 



tcgccatattacaaggtgt 



5795 gtccggggaggccgcgatg 



5796 tcaccccactgcgggattg 



5797 ttgggcccacgccggccta 



5798 ctacgggaccttgcggtag 



5800 ctgtcgtcttctctgacat 



cctggggggcagacaccgc 



5803 ggcgtgtggggacatcatc 



5804 tggggccggccgatagtct 



5806 }gagggggaggttcaagtgg 



5807 aggcccaatcgcccagatg 



5809 caggatctcgtcggctggc 



58 1 0 aggatctcgtcggctggcc 



581 1 gccccccggggcgcgttcc 



58 1 2 gcacctgtggcagctcgga 



581 3 ctgtggcagctcggacctt 



581 6 gagcttgctctcccccagg 



581 / acttgaagggctcttcggg 



581 8 tgtccccgttgagtccatg 



581 9 gaaactactatgcggtccc 



2111[ SEQ ID NO:_ 
2112 SEQ ID NO:_ 
2119 SEQID NO: 
2143 SEQ ID NO:_ 
2161 SEQ ID N0:_ 
2306 SEQ ID NO:" 
_2333_SEQID N0:_ 
2336 SEQ ID N0:_ 
2342 SEQ ID N0:_ 
2402 SEQ IDNO:_ 
2410 S EQ ID NO:_ 
2438 SEQ ID N0:_ 
2479 SEQ ID NO:_ 
2598 SEQ ID NO:_ 
2601 SEQ IDNO:_ 
2640 SEQ IDNO:_ 
2805 SEQ ID NO:_ 
2843SEQIDNO: 



2856 SEQ ID N0:_ 
2857 SEQ ID N0:_ 
2958 SEQ ID N0:_ 
3220SEQ ID NO: 



3252SEQID NO: 
3279SEQIDN0:_ 
3280 SEQ ID NO: 
3316SEQ1DNO: 
3320 SEQ IDNO:_ 



3316 3335 SEQ ID NO:_ 



3378 3397SEQIDNO: 



3499 3518SEQIDNO: 



3509 3528 SEQ ID NO: 



3625 3644 SEQ ID NO: 



3659 3678SEQ ID NO: 



3660 3679 SEQ ID NO: 
3682| 3701 SEQ ID NO:_ 
3730 SEQ ID NO:~ 
3734 SEQIDNO: 
3794 SEQ IDNO:_ 



3792, 3811 SEQ ID NO: 



3947 3966 SEQ ID NO: 



3948 3967 SEQ ID NO: 



4032 4051 SEQ ID NO: 



4138| 4157| SEQ ID NO: 



3947SEQ ID NO:_ 



SEQ ID NO  Match                Start Index  End Index  #  B
6321       gtgctggtaggcggagtcc  5324         5343       1  4
-          ggtgctggtaggcggagtc  5323         5342       1  4
6323       gcctacgagtcttcacgga  8616         8635       1  4
6324       cccgggcagcgggtcgagt  8201         8220       1  4
6325       ctagccggcccaaaaggcc  3611         3630       1  4
6326       Daagccgtgatgggctcct  8162         8181       1  4
6327       gctgggggtcattatgtcc  3128         3147       1  4
6328       3gggtggcccactgctctg  3837         3856       1  4
6329       cagctgctgaagaggctcc  6206         6225       1  4
6330       ggactgggtgtgcacggtg  6286         6305       1  4
6331       aagcaggcggaggctgccg  5564         5583       1  4
-          -                    3929         3948       1  4
-          ?gatgattctga?acc     8875         8894       1  4
-          icagttggacttatccggc  9241         9260       1  4
-          ccaccaagcaggcggaggc  5559         5578       1  4
-          jgattgggcccacgccggc  3214         3233       1  4
-          jgccacgacatcccgcagc  7726         7745       1  4
-          ggcaacagacgctctaat   4647         4666       1  4
-          acacaatctttcctggcga  3539         3558       1  -
-          aacacaatctttcctggcg  3538         3557       1  4
-          ?a cggcacag cc gg    4327         4346       1  4
-          ?aa accaa g g g      8325         8344       1  4
-          ?aggctaggggccgtccaa  5221         5240       1  4
-          -                    9338         9357       1  4
6345       tgtcctacacatggacagg  7617         7636       1  4
6346       atgtcctacacatggacag  7616         7635       1  4
6347       gcggggtaggactggcagg  4804         4823       1  4
6348       cgcccaactcgctcccccc  5794         5813       1  4
6349       gatgttattccggtgcgcc  3755         3774       1  4
6350       agacgacgaccgtgcccca  4761         4780       -  4
6351       ctccacctafggcaagttc  4222         4241       1  4
6352       ccacctgtcaaggcccctc  7304         7323       1  4
6353       catcccgcagcgcgggcct  7734         7753       1  4
6354       acatcccgcagcgcgggcc  7733         7752       1  4
6355       gccaataggccatttcctg  941?         942?       1  4
6356       ggccaataggccatttcct  9409         9428       1  4
6357       ggaacctatccagcagggc  7938         7957       1  4
6358       tccggtggtacacgggtgc  7277         7296       1  4
6359       aaggcaaaggcgtccacag  782?         784?       1  4
6360       ccctgcctgggaaccccgc  5682         5701       1  4
6361       ctggttgggtcacagctcc  6806         6825       1  4
6362       cctggttgggtcacagctc  6805         6824       1  4
6363       cccgtggtggagtccaagt  5585         5604       1  4
6364       catggtctacgccacgaca  7717         7736       1  4
6365       gggaaggcacctcattttc  450?         452?       1  4
6366       ggggggcatatacaggttt  4826         4845       1  4
6367       ttgccaggaccatctggag  4993         5012       1  4
6368       tgctcgccaccgctacgcc  4377         4396       1  4
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 






SEQ ID NO  Match                Start Index  End Index  #  B
-          gtgctcgccaccgctacgc  4376         4395       1  4
-          ggtaaccatgtctccccca  6119         6138       -  4
6371       gggcgctggtatcgctggt  5833         5852       1  4
6372       ttgccccaaccagaatacg  8669         8688       1  4
6373       tccgtgagccgcatgactg  9560         9579       1  4
6374       atgagcggcgaggcgccct  5948         5967       1  4
6375       cgcatgactgcagagagtg  9569         9588       1  4
6376       aatacgacttggagttgat  8682         8701       1  4
6377       gtctcccccacgcactatg  6128         6147       1  4
6378       ccctgccatcctctctcct  5992         6011       1  4
6379       atgctcaccgacccctccc  6863         6882       1  4
6380       gaggccgcaagccagcccg  8067         8086       1  4




cggggacttgccccaacca 






— | 


— ^ 




gg ggc cca.c agccc — 












99 9 B -r-r — 


9517 




— ij 


— ~ A 




agg ggccagggggcc 


1$908 




— ^ 


— 4 






"7906 




-I 


4 




1 I it — tcatci — 

ggcc c c gcaga ca g 


~959l 


~9"6T 










7885 




■\ 


4 




tgacgcccccacattcggc 


788^ 


7903 


•j 


4 


SEQ ID NO  Match                Start Index  End Index  #  B
6389       gtgcccatgtcaggttcca  6676         6695       1  4
6390       cctacacatggacaggtgc  7620         7639       1  4
6391       gatcatcgggccgaaaacc  6478         6497       1  4
6392       agagcggctttatatcggg  8383         8402       1  4
6393       ggcgcgctcgtggccttca  5924         5943       1  4
6394       tccattgttagagtcttgg  7240         7259       1  4
6395       actgcacgatgctcgtgaa  8541         8560       1  4
6396       caggggtggctggcgcgct  5913         5932       1  4
6397       tgggcgctggtatcgctgg  5832         5851       1  4
6398       cagcagggccatcaaccac  7948         7967       1  4
6399       atgtggtctccacccttcc  8142         8161       1  4
6400       cgcccctcctgaccagacc  7453         7472       1  4
6401       cctccttgagggcgacatg  6969         6988       1  4
6402       ccctccttgagggcgacat  6968         6987       1  4
6403       tcatgctcctctatgcccc  7505         7524       1  4
6404       taccaccacgagcttacgc  2751         2770       1  4
6405       gggggagccgggggacccc  7531         7550       1  4
6406       cttcgagcggagggggatg  7130         7149       1  4
6407       cacggcgaccgcccctcct  7444         7463       1  4
6408       actgcacgatgctcgtgaa  8541         8560       1  4
6409       ccgggacgtgcttaaggag  7804         7823       1  4
6410       cgtggaggtcacgcgggtg  6613         6632       1  4
6411       cccctccaataccacctcc  7317         7336       1  4
6412       cccctcctgaccagacctc  7455         7474       1  4
6413       aggagatgggcggaaacat  7059         7078       1  4
6414       gccgtgatgggctcctcat  8165         8184       1  4
6415       caagtggcgagctttggag  5599         5618       1  4
-          -                    8409         8428       1  4
SEQ ID NO: 5871
SEQ ID NO: 5872


accacctccacggagaaaa 


7327 


7346 


SEQ ID NO: 6417 


Itttttccctctttatggt 


9502 


9521 


1 


4 


ccacctccacggagaaaaa 


7328 


7347 


SEQ ID NO: 6418 


ttttccctctttatggtgg 


9504 


952C 


1 


4 


SEQ ID NO: 5873
SEQ ID NO: 5874


acctccacggagaaaaagg 


733C 


7349 


SEQ ID NO: 6419 


cctttgacagactgcaggt 


777C 


778G 


1 


4 


ggttgtcctgacggactcc 


7351 


737C 


SEQ ID NO: 6420 


ggagctcgctaccaaaacc 


7390 


740S 


1 


4 


SEQ ID NO: 5875
SEQ ID NO: 5876


cctgaccagacctccgaca 


7460 


7479 


SEQ ID NO: 6421 


tgtcctacacatggacagg 


7617 


7636 


1 


4 


agcaagctgcccatcaacg 


7667 


7686 


SEQ ID NO: 6422 


cgttgagcaactctttgct 


7686 


770S 


1 


4 


SEQ ID NO: 5877
SEQ ID NO: 5878


ggatgaccattaccgggac 


7792 


7811 


SEQ ID NO: 6423 


gtcccagttggacttatcc 


9238 


9257 


1 


4 


tggcaaagaatgaggtttt 


8028 


8047 


SEQ ID NO: 6424 


aaaaagccctggattgcca 


8931 


895C 


1 


4 


SEQ ID NO: 5879
SEQ ID NO: 5880
SEQ ID NO: 5881
SEQ ID NO: 5882
SEQ ID NO: 5883
SEQ ID NO: 5884
SEQ ID NO: 5885
SEQ ID NO: 5886
SEQ ID NO: 5887
SEQ ID NO: 5888
SEQ ID NO: 5889
SEQ ID NO: 5890
SEQ ID NO: 5891
SEQ ID NO: 5892
SEQ ID NO: 5893


ggcaaagaatgaggttttc 


8029 


8048 


SEQ ID NO: 6425 


gaaaaagccctggattgcc 


8930 


8949 


' 


4 


gggcagcgggtcgagttcc 


8204 


8223 


SEQ ID NO: 6426 


ggaagaaagcaagctgccc 


7660 


7679 


1 


4 


gactagctgcggtaatacc 


8470 


8489 


SEQ ID NO: 6427 


ggtaccgcccttgcgagtc 


9091 


9110 


1 


4 


ctcgcgatcccaccacccc 


8766 


8785 


SEQ ID NO: 6428 


ggggtaccgcccttgcgag 


9089 


9108 


1 


4 


aggatgattctgaigaccc 


8876 


8895 


SEQ ID NO: 6429 


gggtcagcggttgtctcct 


2459 


2478 


1 


4 


agccacttgacctacctca 


8976 


8995 


SEQ ID NO: 6430 


tgagalcaatagggtggct 


9052 


9071 


1 


4 


gggtaccgcccttgcgagt 


9090 


9109 


SEQ ID NO: 6431 


actcgcgatcccaccaccc 


8765 


8784 


1 


4 


ctgcaatgactccctccag 


1624 


1643 


SEQ ID NO: 6432 


ctggcgggctatggggcag 


5897 


5916 


3 


3 


ccagcccccgattgggggc 


1 


20 


SEQ ID NO: 6433 


gcccactggggagtcctgg 


1391 


1410 


2 


3 








SEQ ID NO: 6434


gggggtctcccccctcctt 










ggccccacggacccccggc 


662 


681 


SEQ ID NO: 6435 


gccgcaaagctgtcaggcc 


4553 


4572 


2 


3 




983 


1002 


SEQ ID NO: 6436












ctgcaattgttcgatctac 


1249 


1268 


SEQ ID NO: 6437 


gtaggcggagtcctcgcag 


5330 


5349 


2 


3 


ctccagactgggtttcttg 


1637 


1856 


SEQ ID NO: 6438 


caagtggcgagctttggag 


5599 


5618 


2 


3 


tcgtacctgcgtcgcaggt 


1830 


1849 


SEQ ID NO: 6439 


acctcagatcattgaacga 


8989 


9008 


2 


3 


SEQ ID NO: 5894
SEQ ID NO: 5895
SEQ ID NO: 5896
SEQ ID NO: 5897
SEQ ID NO: 5898
SEQ ID NO: 5899
SEQ ID NO: 5900
SEQ ID NO: 5901
SEQ ID NO: 5902
SEQ ID NO: 5903
SEQ ID NO: 5904
SEQ ID NO: 5905
SEQ ID NO: 5906
SEQ ID NO: 5907
SEQ ID NO: 5908
SEQ ID NO: 5909
SEQ ID NO: 5910
SEQ ID NO: 5911
SEQ ID NO: 5912
SEQ ID NO: 5913
SEQ ID NO: 5914
SEQ ID NO: 5915
SEQ ID NO: 5916
SEQ ID NO: 5917




2026 


2045 


SEQ ID NO: 6440 


gggggagggccgccacttg 


9156 


9175 


2 


3 


aatgctgcatgcaactgga 


2264 


2283 


SEQ ID NO: 6441 


tccaggccaataggccatt 


9405 


9424 


2 


3 


caccctaccggctctgtcc 
cgccatattacaaggtgtt 


2383 
2838 


2402 
2857 


SEQ ID NO: 6442 


ggactacgtccctccggtg 


7267 
5554 


7286 
5573 


2 
2 


3 
3 


cgaagccatcaagggggga 


4489 
5736 


4508 
5755 


SEQ ID NO: 6443 
SEQ ID NO: 6444 


tcccagatttgggagttcg 


8097 


8116 


2 


3 


ccagcccgctcaccaccca 
ggctatgactaggtactcc 


8635 


8654 


SEQ ID NO: 6445 
SEQ ID NO: 6446 


tgggtacaagggagtctgg 
ggagacatatatcacagcc 


6382 
9284 


6401 
9303 


2 
2 


3 
3 


ctccaccatagatcactcc 


• 24 


43 


SEQ ID NO: 6447


ggagacatcgggccaggag 


9111 


9130 


1 


3 


tccaccatagatcactccc 


25 


44 


SEQ ID NO: 6448 


gggagttcgatgaaatgga 


5451 


5470 


1 


3 


caccatagatcactcccct 


27 


46 


SEQ ID NO: 6449 


aggggccccaggttgggtg 


458 


477 


1 


3 


tcactcccctgtgaggaac 


36 


55 


SEQ ID NO: 6450 


gttctggaggacggcgtga 


809 


828 


1 


3 


cgttagtatgagtgtcgtg 


88 


107 


SEQ ID NO: 6451 


cacgctgcacgggccaacg 


5191 


5210 


1 


3 


tgtcgtgcagcctccagga 


100 


119 


SEQ ID NO: 6452 


tcctgttgtcgtggggaca 


1879 


1898 


1 


3 


ccccccctcccgggagagc 


119 


138 


SEQ ID NO: 6453 


gctcccggcctagttgggg 


645 


664 


1 


3 


ggagagccatagtggtctg 


131 


150 


SEQ ID NO: 6454 


cagatcattgaacgactcc 


8993 


9012 


1 


3 


gagccatagtggtctgcgg 






SEQ ID NO: 6455 


ccgctgctgggtagcgctc 






1 


3 


gtggtctgcggaaccggtg 


142 


161 


SEQ ID NO: 6456 


cacccatatagatgcccac 


5038 


5057 


1 


3 


agtacaccggaattgccag


161 


180 


SEQ ID NO: 6457 


ctggcgggccttgcctact 


1406 


1425 


1 


3 


ggtcctttcttggatcaac 


188 


207 


SEQ ID NO: 6458 


gttgagtgacttcaagacc 


6304 


6323 


1 


3 


ttcttggatcaacccgctc 


194 


213 


SEQ ID NO: 6459 


gagcggagggggatgagaa 


7134 


7153 


1 


3 


ctcaatgcctggagatttg 


210 


229 


SEQ ID NO: 6460 


caaagactccgacgctgag 


7486 


7505 


1 


3 


tgcctggagatttgggcgt 


215 


234 


SEQ ID NO: 6461 


acgcggccgccgcaaggca 


1967 


1986 


1 


3 


gcctggagatttgggcgtg 


216 


235 


SEQ ID NO: 6462 


cacgcggccgccgcaaggc 


1966 


1985 


1 


3 


gagatttgggcgtgccccc 


221 


240 


SEQ ID NO: 6463 


ggggacaaccgatcgtctc 


1891 


1910 


1 


3 
SEQ ID NO: 5918
SEQ ID NO: 5919
SEQ ID NO: 5920
SEQ ID NO: 5921
SEQ ID NO: 5922
SEQ ID NO: 5923
SEQ ID NO: 5924
SEQ ID NO: 5925
SEQ ID NO: 5926
SEQ ID NO: 5927
SEQ ID NO: 5928
SEQ ID NO: 5929
SEQ ID NO: 5930
SEQ ID NO: 5931
SEQ ID NO: 5932
SEQ ID NO: 5933
SEQ ID NO: 5934
SEQ ID NO: 5935
SEQ ID NO: 5936
SEQ ID NO: 5937
SEQ ID NO: 5938
SEQ ID NO: 5939
SEQ ID NO: 5940
SEQ ID NO: 5941
SEQ ID NO: 5942
SEQ ID NO: 5943
SEQ ID NO: 5944
SEQ ID NO: 5945
SEQ ID NO: 5946
SEQ ID NO: 5947
SEQ ID NO: 5948
SEQ ID NO: 5949
SEQ ID NO: 5950
SEQ ID NO: 5951
SEQ ID NO: 5952
SEQ ID NO: 5953
SEQ ID NO: 5954
SEQ ID NO: 5955
SEQ ID NO: 5956
SEQ ID NO: 5957
SEQ ID NO: 5958
SEQ ID NO: 5959
SEQ ID NO: 5960
SEQ ID NO: 5961
SEQ ID NO: 5962
SEQ ID NO: 5963
SEQ ID NO: 5964


aaaggccttgtggtactgc 


273 


292 


SEQ ID NO: 6464 


gcagaagaaggtcaccttt 












274 


293 


SEQ ID NO: 6465


jgcagaagaaggtcacctt 


7755 


7774 


1 


3 


gtggtactgcctgataggg 


282 


301 


SEQ ID NO: 6466 


ccctaccggctctgtccac


2385 


2404 


1 


3 


cctgatagggtgcttgcga


291 


310 


SEQ ID NO: 6467


cgccggcccgagggcagg 


544 


563 


1 


3 


-gagtgccccgggaggtct 






SEQ ID NO: 6468


agacgcagtgtcgcgctcg 


4780 


4799 


1 


3 


gccccgggaggtctcgtag 


|^ 


— IP 


SEQ ID NO: 6469


ctaccttaggttttggggc 


4122 


4141 


1 


3 


tacctgttgccgcgcagg 






SEQ ID NO: 6470 


.ctgcgttcgggagggtaa 


1023 


1042 


1 


3 


acctgttgccgcgcaggg 






SEQ ID NO: 6471 




1022 


1041 


1 


3 


cctgttgccgcgcaggggc


— — 




SEQ ID NO: 6472


gcccccgaagccagacagg


8348 


8367 


1 


3 


ctgttgccgcgcaggggcc 


446 


465 


SEQ ID NO: 6473 


ggcccccgaagccagacag 


8347 


8366 


1 


3 


ccgagcggtcgcaacccc 


497 


516 


SEQ ID NO: 6474 


ggggcaaaggacgtccgga 


7922 


7941 


1 


3 


ggtcgcaaccccgtggaag 


504 


523 


SEQ ID NO: 6475 


cttctctgacatggagacc


3268 


3287 


1 


3 


gtcgcaaccccgtggaagg 


505 


524 


SEQ ID NO: 6476 


ccttcaccattgagacgac 


4749 


4768 


1 


3 


aaggcgacagcctatcccc 


520 


539 


SEQ ID NO: 6477 


ggggcgctgccagggcctt 


774 


793 


1 


3 


cagcctatccccaaggctc 


527 


546 


SEQ ID NO: 6478 


gagcacaggcttaatgctg 


2252 


2271 


1 


3 


gagggcagggcctgggctc 


554 


573 


SEQ ID NO: 6479 


gagcgtcttcacaggcctc 


5020 


5039 


1 


3 


cagggcctgggctcagccc 


559 


578 


SEQ ID NO: 6480 


gggcatcggcacagtcctg 


4324 


4343 


1 


3 


gggcctgggctcagcccgg 


561 


580 


SEQ ID NO: 6481 


ccggccgcatatgcggccc 


4064 


4083 


1 


3 


cctgggctcagcccgggta 


564 


583 


SEQ ID NO: 6482 


taccgaccctaacatcagg 


4162 


4181 


1 


3 


cccctctatggcaatgagg 


590 


609 


SEQ ID NO: 6483 


cctcgccgacctcatgggg 


727 


746 


1 


3 


gagggcatggggtgggcag 


605 


624 


SEQ ID NO: 6484 


ctgcggatctgttttcctc 


1180 


1199 


1 


3 


agggcatggggtgggcagg 


606 


625 


SEQ ID NO: 6485 


cctgctctttcaccaccct 


2370 


2389 


1 


3 


aggatggctcctgtcaccc 


622 


641 


SEQ ID NO: 6486 


gggtcagcggttgtctcct 


2459 


2478 


1 


3 


gatggctcctgtcaccccg 


624 


643 


SEQ ID NO: 6487 


cgggggcgcttacgacatc 


4261 


4280 


1 


3 


tgtcaccccgcggctcccg 


633 


652 


SEQ ID NO: 6488 


cggggcgcgttccctgaca 


3688 


3707 


1 


3 


gtcaccccgcggctcccgg 


634 


653 


SEQ ID NO: 6489 


ccggggcgcgttccctgac 


3687 


3706 


1 


3 


gcggctcccggcctagttg 


642 


661 


SEQ ID NO: 6490 


caacgtccggggaggccgc 


2935 


2954 


1 


3 


ctcccggcctagttggggc 


646 


665 


SEQ ID NO: 6491 


gccctgtcgaacactggag 


4439 


4458 


1 


3 


ataccctcacatgcggcct 


711 


730 


SEQ ID NO: 6492 


aggcaacattatcatgtat 


8839 


8858 


1 


3 


ttccgctcgtcggcggccc 


750 


769 


SEQ ID NO: 6493 


gggcaaagcacatgtggaa 


5625 


5644 


1 


3 


cccctagggggcgctgcca 


767 


786 


SEQ ID NO: 6494 


tggcaatgagggcatgggg 


598 


617 


1 


3 


tgcaacagggaacctgccc 


832 


851 


SEQ ID NO: 6495 


gggctcattcgtgcatgca 


3092 


3111 


1 


3 


gcgtaacgcgtccggggta 


922 


941 


SEQ ID NO: 6496 


taccaccacgagcttacgc 


2751 


2770 


1 


3 


tcaagcattgtgtttgagg 


968 


987 


SEQ ID NO: 6497 


cctctatgcccccccttga 


7512 


7531 


1 


3 


cccacgctcgcggccagga 


1070 


1089 


SEQ ID NO: 6498 


tcctgtttaacatcttggg 


5763 


5782 


1 


3 


cggccaggaatgctaccat 


1080 


1099 


SEQ ID NO: 6499 


atggcatgcatgtcggccg 


5279 


5298 


1 


3 


acgacaatacgacaccacg 


1106 


1125 


SEQ ID NO: 6500 


cgtggggacaaccgatcgt 


1888 


1907 


1 


3 


gggcggctgctctctgctc 


1140 


1159 


SEQ ID NO: 6501 


gagcaacttgaaaaagccc 


8921 


8940 


1 


3 


cgtgggggacctctgcgga 






SEQ ID NO: 6502 


tccgttgccggagcgcacg 










agctgttcaccttctcgcc 


1206 


1225 


SEQ ID NO: 6503 


ggcgacaatagagggagct 


3779 


3798 


1 


3 


ctgttcaccttctcgcccc 


1208 


1227 


SEQ ID NO: 6504 


ggggagacatatatcacag 


9282 


9301 


1 




ctgcaattgttcgatctac 


1249


1268 


SEQ ID NO: 6505 


gtaggactggcaggggcag 


4809 


4828 


1 


3 


attgttcgatctaccccgg 


1254 


1273 


SEQ ID NO: 6506


ccggcccaaaaggcccaat 


3615 


3634 


1 


3 


atctaccccggccacgcgt 


1262 


1281 


SEQ ID NO: 6507 


acgccatggaccgggagat 


2766 


2785 


1 


3 


cggccacgcgtcaggtcac 


1270


1289


SEQ ID NO: 6508 


gtgatgctactttttgccg 


1460 


1479


1 


3 


ccgcatggcctgggacatg 


1286 


1307 


SEQ ID NO: 6509


catggaaactactatgcgg 


3943 


3962 


1 


3 


cgcagttactccggatccc 


1344 


1363 


SEQ ID NO: 6510


gggaacccaggaggatgcg 


8593 


8612 


1 


3 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO:
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



5966 ctggggagtcctggcgggc 1396



5970 gggggggcacgctgcccgc 



5971 ggggcacgctgcccgcctc 



ctccagactgggtttcttg 



5975 cccggagcgcatggccagc



5976 ctgccgctccattgacaag 



5977 aagttcgaccagggatggg 



5979 ccagaggccttattgctgg 



5981 tcgtacctgcgtcgcaggt 



5982 tgcgtcgcaggtgtgtggt 



5984 cagctggggggagaacgat 



5985 cgccgcaaggcaactggtt 



5989 gttcaccaagacgtgcggg 



5992 taacaccttgacctgcccc 



6000 gacagatcggagctcagcc 
6001 



6003 ggcttgatccacctccatc 



6004 gtcagcggttgtctccttt 



6005 gagtatgtcgtgttgcttt 



6008 agaacctggtggccctcaa 



6009 tacatcaagggcaggctgg 

6010 caagggcaggctggtccct
6011 gcatggccgctgctcctgc



1993 SEQ ID NO:



6511 




5306 


5325 


1 


3 


6512 


gcccggagcgcatggccag 


1695 


1714 


1 


3 


6513 


atagaagaagcctgccgcc 


7865 


7884 


1 


3 


6514 


gcccccacattcggccaaa 


7888 


7907 


1 


3 


• 6515 


ccccaatatcgaggaggtg 


4420 


4439 


1 


3 


6516 


gcggcacggcgaccgcccc 


7440 


7459 


1 


3 


: 6517 


gagggagcttgctctcccc 


3789 


3808 


1 


3 


. 6518 


accctcacaacgggggggc 


1493 


1512 


1 


3 


: 6519 


ggttatcgtgggtaggat 


5382 


5401 


1 


3 


• 6520 


caagcggagacggctggag 


4346 


4365 


1 


3 


• 6521 


jctgtgggcgtcttccggg 


3869 


3888 


1 


3 


: 6522 


cttggtacatcaagggcag 


2667 


2686 


1 


3 


: 6523 


cccaaccagaatacgactt 


8673 


8692 


1 


3 


: 6524 


gcatgtgtgggttcccccc 


2914 


2933 


1 


3 


: 6525 


ccaggatctcgtcggctgg 


3658 


3677 


1 


3 


: 6526 


accaagatcatcacctggg 


3284 


3303 


1 


3 


: 6527 


accttcaccattgagacga 


4748 




1 


3 


: 6528 


accatgtctcccccacgca 


6123 


6142 


1 


3 


• 6529 


agacgacgaccgtgcccca 


4761 


4780 




3 


: 6530 


atcggagctcagcccgctg 


2320 


2339 


1 


3 


. 6531 


aacccaggaggatgcggcg 


8596 


8615 


1 


3 


■ 6532 


gaacccaggaggatgcggc 


8595 


8614 


1 


3 


. 6533 


gctataaaatcgctcacag 


8366 


8385 


1 


3 


: 6534 


tgctgctcaatgtcctaca 


7607 


7626 


1 


3 


• 6535 


cccgctcaccacccagaac 


5740 


5759 


1 


3 


: 6536 


ggggaggttcaagtggtct 


3512 


3531 


1 


3 


: 6537 


ccccaatcgatgaacgggg 


9376 


9395 


1 


3 


: 6538 


ggggacgaccttgtcgtta 


8561 


8580 


1 


3 


• 6539 


caggaggatgcggcgagcc 


8600 


8619 


1 


3 


: 6540 


tggatggggtgcggttgca 


6717 


6736 


1 


3 


: 6541 


gcatcatgcacaccacctg 


6411 


6430 


1 


3 


; 6542 


tccatggtcttagcgcatt 


9009 


9028 


1 


3 


: 6543 


cgggaccttgcggtagcag 


3236 


3255 


1 


3 


; 6544 


ctcttacgggatgaggttg 


6761 


6780 


1 


3 


): 6545 


gctctcccccaggcctgtc 


3799


3818 


1 


3 


6546


ggctggagcgcggcttgtc 


4357 


4376 


1 


3 


): 6547 


gggccaacgcccctgctgt 


5201 


5220 


1 


3 


): 6548 


ggagagggggccgtgcagt 


6068


6087 


1 


3 


6549


gatgatgctgctgatagcc 


2551 


2570


1 


3 


6550


aaaggacggttgtcctgac 


7344 


7363 


1 


3 


): 6551 


aaagaccaagctcaaactc 


9202 


9221 


1 


3 


); 6552 


atcactgatggcattcaca 


5707 


5726 


1 


3 


6553


ttctgattgccatactcgg 


3015


3034 


1 


3 


6554


ttgatatcaccaaacttct 


3000


3019


1 


3 


6555


ccagatgtacactaatgta 


3637


3656 


1 


3 


6556


aggggtaggcatctacttg 


9355


9374 


1 


3 


6557


gcagtgctcacttccatgc 


6848


6867 


1 


3 
SEQ ID NO: 6012
SEQ ID NO: 6013 
SEQ ID NO: 6014 
SEQ ID NO: 6015 
SEQ ID NO: 6016 
SEQ ID NO: 6017 
SEQ ID NO: 6018 
SEQ ID NO: 6019 
SEQ ID NO: 6020 
SEQ ID NO: 6021 
SEQ ID NO: 6022 
SEQ ID NO: 6023 
SEQ ID NO: 6024
SEQ ID NO: 6025 
SEQ ID NO: 6026 
SEQ ID NO: 6027 
SEQ ID NO: 6028 
SEQ ID NO: 6029 
SEQ ID NO: 6030 
SEQ ID NO: 6031 
SEQ ID NO: 6032 
SEQ ID NO: 6033 


catggccgctgctcctgct


2721 


2740 


SEQ ID NO. 
SEQ ID NO: 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 


6558 


agcagtgctcacttccatg 


6847 


6866 


1 


3 


gccgctgctcctgctcctc 


2725 


2744 


6559 


gagggccgccacttgcggc 


9160 


9179 


1 


3 


ggagatggctgcatcgtgc 


2779 


2798 


6560 


gcacggcgaccgcccctcc 


7443 


7462 


1 


3 


atggctgcatcgtgcggag 








ctccaggccaataggccat


9404 


9423 




3 


ggcgcggtttttgtgggtc 


1 






gaccattaccacgggcgcc


4192 


4211 


•j 


3 


cttatcaccagagctgag 


2887


2906 




;tcacaggccgggacaaga 


3482 


3501 


— | 


— f- 


gtgtgggttccccccctca 


2918 


2937 




gaggtcaccctcacacac 








— - 


ccccccctcaacgtccgg 


2926 


2945 


6565 


ccggctcgtggctgaggga 










ctcaacgtccggggaggcc 








ggcctgttactccattgag


8959 


8978 


1 


3 


accaaacttctgattgcca 








ggctctctacgatgtggt 


8130 


8149 


1 


3 


caaacttctgattgccata 








atgacacccgctgttttg 


8267 


8286 


1 


3 


ggaccgctcatggtgctcc 


3Q32 






ggagatcctgcggaagtcc


7171 


7190 


1 


3 


gaccgctcatggtgctcca






6570 


tggaaactactatgcggtc


3945 


3964 


1 


3 


atgcatgttagtgcggaaa 










9348 


9367 


1 


3 


tatgtccaaatggccttcc






6572 


gaagccagacaggctataa


8354 


8373 


1 


3 


ccaaatggccttcatgaga


314 ^ 




6573 


tctcagcgacgggtcttgg


7552 


7571 


1 


3 


ccttcatgagactgggcgc 


"315^ 




6574 


gcgctcgtggccttcaagg


5927 


5946 


1 


3 


ccttgcggtagcagtggag 


3241 


3260 


6575 


ctccgcccgaaggggaagg 


3349 


3368 


1 


3 


tgtcgtcttctctgacatg 


3262 


3281 




catggtctacgccacgaca 


77 V 






— | 


tggggggcagacaccgcgg 


3299 


3318 




ccgccttatcgtattccca 




— 




— - 


ggggggcagacaccgcggc 


3300 


3319 




jccgcccaactcgctcccc 


— — 


-— — 




— - 


gtggggacatcatcctggg 


3321 


3340 




cccatctacacgctcccac 




_403. 







SEQ ID NO: 6034 


tggggacatcatcctgggc 


3322 


3341 


6580 


gcccatctacacgctccca 


4019


4038


— - 


— | 


SEQ ID NO: 6035 
SEQ ID NO: 6036 
SEQ ID NO: 6037 
SEQ ID NO: 6038 
SEQ ID NO: 6039 
SEQ ID NO: 6040 
SEQ ID NO: 6041 
SEQ ID NO: 6042 
SEQ ID NO: 6043 
SEQ ID NO: 6044 
SEQ ID NO: 6045 
SEQ ID NO: 6046 
SEQ ID NO: 6047 
SEQ ID NO: 6048 

SEQ ID NO: 6049
SEQ ID NO: 6050
SEQ ID NO: 6051
SEQ ID NO: 6052
SEQ ID NO: 6053
SEQ ID NO: 6054
SEQ ID NO: 6055
SEQ ID NO: 6056
SEQ ID NO: 6057
SEQ ID NO: 6058


ggggacatcatcctgggcc 


3323 


3342 




ggccagggggtctcccccc 








— - 


acctgtctccgcccgaagg 


3343 


3362 




cctttgacagactgcaggt 


~tHi 




— ^ 




tgtctccgcccgaagggga 


3346 


3365 


6583 


tccccggtcttcacagaca 


3962 


3981 


1 


3 


gggagatactcctggggcc 


3366


3385


6584 


ggcccatctacacgctccc


4018 


4037 


1 


3 


ctcccaacagacccggggc 


3439 


3458 


6585 


gcccccccttgagggggag 


7519 


7538 


1 


3 


ccaccgcaacacaa c 


3530 


3549 


6586 


aagaggctccaccagtgga 


6215 


6234 


1 


3 


cacaa c cc gg ga 


3540 


3559 


6587 


gtcgtcggagtcgtgtgtg 


6020 


6039 


1 


3 


ggctggccggcgccccccg 


3671 


3690 


6588 


cgggttgttgcaaacagcc 


5542 


5561 


1 


3 


ccccggggcgcgttccctg 


3685


3704 


6589 


caggtttgtaactccgggg 


4840 


4858 


1 


3 


tccctgacaccatgcacct 


3698


3717 


6590 


aggtcacgcgggtggggga 


6618 


6637 


1 


3 




3762 


3781 


6591 


ccccgttgagtccatggaa 


3931 


3950 


1 


3 


ctcccccaggcctgtctcc 


3802 


3821 


6592 


ggagacatcgggccaggag 


9111 


9130 


1 


3 




3904 


3923 


6593 


caccctgcctgggaacccc 


5680 


5698 


1 


3 


mg^ccccgttgTgtcca 9 ^ 


3926 


3945 


6594 


tggagaccttctgggcaaa


5613 


5632 


1 


3 


ccgtaccgcaaacattcca 


3996 


4015 


6595 


tggattgccaaatctacgg 


8940 


8959


1 


3 


caagtggcccatctacacg 


4013


4032 


6596 


cgtgggtaggatcatcttg 


5389 


5408 


1 


3 


cacgctcccactggcagcg 


4028


4047 


6597 


cgctgcttcggctttcgtg


5815 


5834 


1 


3 


ccgcatatgcggcccaagg 


4068


4087 


6598 


ccttcaaggtcatgagcgg 


5937 


5956 


1 


3 


cgtatatgtctaaagcaca 


4140


4159


6599


tgtggaagtgtctcatacg 


5163 


5182 


1 


3 


gtatatgtctaaagcacat 


4141 


4160


6600 


atgtggaagtgtctcatac 


5162 


5181 


1 


3 


ggaccattaccacgggcgc 


4191 


4210


6601 


gcgcgtgtcactcaggtcc 


6167 


6186 


1 


3 


cccccattacgtactccac 


4205 


422£ 


6602 


gtgggcccgggagaggggg 


6059 


6078


1 


3 


agttccttgccgacggtgg 


4236 


4255


SEQ ID NO 


6603


ccacagtcaaggctaaact 


7838 


7857


1 


3 


gagacggctggagcgcggc 


4355 


4374


SEQ ID NO 


6604 


gccgggggaccccgatctc 


7537 


7556 


1 


3 
SEQ ID NO: 6059
SEQ ID NO: 6060
SEQ ID NO: 6061
SEQ ID NO: 6062
SEQ ID NO: 6063
SEQ ID NO: 6064
SEQ ID NO: 6065
SEQ ID NO: 6066
SEQ ID NO: 6067
SEQ ID NO: 6068
SEQ ID NO: 6069


caccgctacgcctccagga 


4384 


4403


SEQ ID NO: 6605


tcctacacatggacaggtg 


7619


7638




C 




tggagagatccccttctac 


4453


4472 


SEQ ID NO: 6606 


gtagcagtgctcacttcca 


6845


6864




C 




agccatccccatcgaagcc 


4477 


4496 


SEQ ID NO: 6607


ggctggttcgttgciggct 


9257 


9276




: 




tccccatcgaagccatcaa 


4482 


4501 


SEQ ID NO: 6608


ttgagggggagccggggga 


7527 


7546




; 




ccccatcgaagccatcaag 


4483


4502 


SEQ ID NO: 6609


cttgagggggagccggggg 


7526 


7545




: 




ggcctcggaatcaatgctg 


4568 


4587 


SEQ ID NO: 6610


cagctccgaattgtcggcc 


7414 


7433




c 




gtccgtcataccgaccagc


4612 


4631


SEQ ID NO: 6611


gctgagggatgtttgggac 


6271 


6290


1 


c 




gtcataccgaccagcggag 


4616 


4635


SEQ ID NO: 6612 


ctccattgagccacttgac 


8968 


8987




: 




cgggctataccggtgactt 


4668 


4687 


SEQ ID NO: 6613


aagtccaagaagttccccg 


7184 


7203


1 


c 




ctttgattcagtgatcgac 


4684 


4703 


SEQ ID NO: 6614


gtcgagttcctggtaaaag 


8213 


8232 


1 


c 




acagtcgacttcagcttgg 


4724 


4743 


SEQ ID NO: 6615


ccaaatctacggggcctgt


8947 


8966


1 




SEQ ID NO: 6070
SEQ ID NO: 6071
SEQ ID NO: 6072
SEQ ID NO: 6073
SEQ ID NO: 6074
SEQ ID NO: 6075
SEQ ID NO: 6076
SEQ ID NO: 6077
SEQ ID NO: 6078
SEQ ID NO: 6079
SEQ ID NO: 6080
SEQ ID NO: 6081
SEQ ID NO: 6082
SEQ ID NO: 6083
SEQ ID NO: 6084
SEQ ID NO: 6085
SEQ ID NO: 6086


cttggaccccaccttcacc 


4738 


4757 


SEQ ID NO: 6616 


ggtgttgagtgacttcaag 


6301 


6320


1 




gagacgacgaccgtgcccc 


4760 


4779 


SEQ ID NO: 6617 


ggggacaaccgatcgtctc 


1891 


1910


1 


3 


ggggtaggactggcagggg 


4806 


4825 


SEQ ID NO: 6618


ccccccggggacttgcccc 


8657 


8676 


1 




gggcatatacaggtttgta


4831 


4850 


SEQ ID NO: 6619


tacacatggacaggtgccc 


7622 


7641 


1 


3 


gggggaacggccctcgggc 


4855 


4874 


SEQ ID NO: 6620


gcccctgcacgccttcccc 


6546 


6566 


1 


3 


tgacgcgggctgtgcttgg 


4906 


4925 


SEQ ID NO: 6621 


ccaattgacaccaccgtca 


8009 


8028 


1 


3 


gacgcgggctgtgcttggt 


4907 


4926 


SEQ ID NO: 6622 


accaattgacaccaccgtc 


8008 


8027 


1 


3 


tgcttggtacgagctcacc 


4918 


4937 


SEQ ID NO: 6623 


ggtgcggctgttggcagca 


5849 


5868 


1 


3 


tgcccacttcctgtcccag 


5050 


5069 


SEQ ID NO: 6624 


ctgggcgcgctgacgggca 


3164 


3183 


1 


3 


ggtggcataccaagccaca 


5101 


5120 


SEQ ID NO: 6625 


tgtgacaccaattgacacc 


8002 


8021 


1 


3 




gggctcaggccccacctcc 


5130 


5149 


SEQ ID NO: 6626


ggaggccgcaagccagccc 


8066 


8085 


1 


3 


ccatcgtgggatcaaatgt 


5147 


5166 


SEQ ID NO: 6627 


acattctggcgggctatgg 


5892 


5911 


1 


3 


tcatacggctaaaacccac 


5175 


5194 


SEQ ID NO: 6628 


gtggccttcaaggtcatga 


5933 


5952 


1 


3 


tgctgtataggctaggggc 


5214 


5233 


SEQ ID NO: 6629 


gcccgaaccggacgtagca 


6832 


6851 


1 


3 


ccaaatacatcatggcatg 


5268 


5287 


SEQ ID NO: 6630 


catgcctcaggaaacttgg 


9072 


9091 


1 


3 


ggagtcctcgcagctctgg 


5336 


5355


SEQ ID NO: 6631 


ccagctgtctgcgccctcc


6955 


6974 


1 


3 


gcctgacaacaggcagtgt 


5364 


5383 


SEQ ID NO: 6632 


acactccaggccaataggc 


9401 


9420 


1 


3 


SEQ ID NO: 6087
SEQ ID NO: 6088
SEQ ID NO: 6089
SEQ ID NO: 6090
SEQ ID NO: 6091
SEQ ID NO: 6092
SEQ ID NO: 6093
SEQ ID NO: 6094
SEQ ID NO: 6095
SEQ ID NO: 6096
SEQ ID NO: 6097
SEQ ID NO: 6098
SEQ ID NO: 6099
SEQ ID NO: 6100
SEQ ID NO: 6101
SEQ ID NO: 6102
SEQ ID NO: 6103
SEQ ID NO: 6104
SEQ ID NO: 6105




5557 


5576 


SEQ ID NO: 6633 


ctccagttaactcctggct 


8820 


8839 




3 


catgtggaatttcatcagc 


5635 


5654 


SEQ ID NO: 6634 


gctgcgccatcacaacatg 


7702 


7721 


1 


3 




5728 


5747 


SEQ ID NO: 6635 


gagccgcatgactgcagag 


9565 


9584 


1 


3 




cccagaacaccctcctgtt 


5751 


5770 


SEQ ID NO: 6636 


aacatcttgggggggtggg 


5771 


5790 


1 


3 


ctcctgtttaacatcttgg 


5762 


5781 


SEQ ID NO: 6637 


ccaatcgatgaacggggag 


9378 


9397 


1 


3 


ttgggggggtgggtagccg


5777 


5796 


SEQ ID NO: 6638 


cggcgccaaactattccaa 


6564 


6583 


1 


3 


tgcttcggctttcgtgggc


5818 


5837 


SEQ ID NO: 6639 


gcccgaaccggacgtagca 


6832 


6851 


1 


3 


tcgtgggcgctggtatcgc


5829 


5848 


SEQ ID NO: 6640 


gcgagcggcgtgctgacga 


8453 


8472 


1 


3 




5845 


5864 


SEQ ID NO: 6641 


gccacgacatcccgcagcg 


7727 


7746 


1 


3 


cggctgttggcagcatagg


5853 


5872 


SEQ ID NO: 6642 


cctagactctttcgagccg 


7111 


7130 


1 


3 


ggggcaggggtggctggcg 






SEQ ID NO: 6643 


cgcccaactcgctcccccc 


5794 


5813 


1 


3 


stggcgcgctcgtggcctt 


5922 


5941 


SEQ ID NO: 6644 


aagggaggccgcaagccag 


8063 


8082 


1 


3 


tggcgcgctcgtggccttc


5923 


5942 


SEQ ID NO: 6645 


gaagggaggccgcaagcca 


8062 


8081 


1 


3 


gagcggcgaggcgccctct 


5950 


5969 


SEQ ID NO: 6646 


agagcgtcgtctgctgctc 


7596 


7615 




3 


tgggcccgggagagggggc


6060 


6079 


SEQ ID NO: 6647 


gcccatctacacgctccca 


4019 


4038 




3 


Dggctgatagcgttcgctt 


6095 


6114 


SEQ ID NO: 6648


aagcaggcggaggctgccg 


5564 


5583 


1 


3 


gtgcctgagagcgacgccg 


6146 


6165 


SEQ ID NO: 6649 


cggccggcgacagcggcac


7428 


7447 


1 


3 


atgaggactgttctacgcc 


6237 


6256 


SEQ ID NO: 6650 


ggcggggggacggcatcat 


6399 


6418 


1 


3 


gtccaagctcctgccgcgg 


6331 


6350 


SEQ ID NO: 6651


scgctccgtgtgggaggsc 


7969


7988


1 


3 
SEQ ID NO: 6106
SEQ ID NO: 6107
SEQ ID NO: 6108
SEQ ID NO: 6109
SEQ ID NO: 6110
SEQ ID NO: 6111
SEQ ID NO: 6112
SEQ ID NO: 6113
SEQ ID NO: 6114
SEQ ID NO: 6115
SEQ ID NO: 6116
SEQ ID NO: 6117
SEQ ID NO: 6118
SEQ ID NO: 6119
SEQ ID NO: 6120
SEQ ID NO: 6121
SEQ ID NO: 6122
SEQ ID NO: 6123
SEQ ID NO: 6124
SEQ ID NO: 6125
SEQ ID NO: 6126
SEQ ID NO: 6127
SEQ ID NO: 6128
SEQ ID NO: 6129
SEQ ID NO: 6130
SEQ ID NO: 6131
SEQ ID NO: 6132
SEQ ID NO: 6133
SEQ ID NO: 6134


acagatcgccggacatgtc 


6442 


6461 


SEQ ID NO: 6652 


gacatatatcacagcctgt 


9287


9306 


1 


3 


acgtggcatggaacattcc 


6506 


6525


SEQ ID NO: 6653


ggaagaacccggactacgt 


7257


7276 


1 


3 


gggcccctgcacgccttcc 


6544 


6563


SEQ ID NO: 6654 


ggaagaaagcaagctgccc 


7660


7679


1 


3 


agtgcccatgtcaggttcc 


6675


6694 


SEQ ID NO: 6655


ggaaacagctagacacact 


8803 


8822 




3 


tgcccatgtcaggttccag 


6677 


6696 


SEQ ID NO: 6656 


ctgggcgcgctgacgggca 


3164 


3183 




3 


cagctcctgagtttttcac 


6693 


6712 


SEQ ID NO: 6657 


gtgagagcgtcgtctgctg 


7593


7612 






tcacggaggtggatggggt 


6708 


6727 


SEQ ID NO: 6658 


acccttcctcaagccgtga 


8152 


8172 




3 


cacggaggtggatggggtg 


6709 


6728 


SEQ ID NO: 6659 


cacccttcctcaagccgtg 


8152 






3 


gacccctcccacattacag 


6872 


6891 


SEQ ID NO: 6660 


ctgttttgactcaacggtc 


8278 








ttggccagggggtctcccc 


6911 


6930 


SEQ ID NO: 6661


ggggtgggtagccgcccaa 


5782 




— ■ 


— t 


ccttgagggcgacatgcac 


6972 


6991 


SEQ ID NO: 6662 


gtgcttaaggagatgaagg 


7811 


7830 


■I 


3 


ggagatgggcggaaacatc 


7060 


7079 


SEQ ID NO: 6663 


gatgacccatttcttctcc 


8887 








gagatgggcggaaacatca 


7061 


7080 


SEQ ID NO: 6664 


tgatgacccatttcttctc 


8886 


"Ho^ 




— | 


ctagactctttcgagccgc 


7112 


7131 


SEQ ID NO: 6665 


gcggcgtgctgacgactag 


8457 


8476 




— - 


tagactctttcgagccgct 


7113 


7132 


SEQ ID NO: 6666 


agcgacgggtcttggtcta 


7556 


7575 




g 


agaatgaaatatccattgc 


7149 


7168 


SEQ ID NO: 6667 


gcaaagaatgaggttttct 


8030 


8049 


1 


3 


ttgcggcggagatcctgcg 


7164 


7183 


SEQ ID NO: 6668 


cgcacgatgcatctggcaa 


8730 


8749 


-] 


3 


agcgaggaggctggtgaga 


7580 


7599 


SEQ ID NO: 6669 


ctcgtgcccgaccccgct 


9305 


9324 


1 




tgagagcgtcgtctgctgc


7594 


7613 


SEQ ID NO: 6670 


gcagtaaagaccaagctca 


9197 


9216 


1 


3 


gtcgtctgctgctcaatgt 


7601 


7620 


SEQ ID NO: 6671 


acatggtctacgccacgac 


7716 


7735 


1 


3 


tgcgccatcacaacatggt


7704 


7723 


SEQ ID NO: 6672 


accatgtctcccccacgca 


6123 


6142 


1 


3 


cagaagaaggtcacctttg 


7757 


7776 


SEQ ID NO: 6673 


caaagaatgaggttttctg 


8031 


8050 


1 


3 


cctggatgaccattaccgg 


7789 


7808 


SEQ ID NO: 6674 


ccggaacctatccagcagg 


7936 


7955 


1 


3 


ggacgtgcttaaggagatg 


7807 


7826 


SEQ ID NO: 6675 


catcgggccaggagcgtcc 


9116 


9135 


1 


3 


aaagaatgaggttttctgc 


8032 


8051 


SEQ ID NO: 6676 


gcagaagaaggtcaccttt 


7756 


7775 


1 


3 


agttcgtgtatgcgagaag 


8110 


8129 


SEQ ID NO: 6677 


Dttcatgcctcaggaaact 


9069 


9088 


1 


3 


ggctataaaatcgctcaca


8365 


8384 


SEQ ID NO: 6678 


gtgaaaggtccgtgagcc 


9551 


9570 


1 


3 


tctccatccttctagctc 


8900 


8919 


SEQ ID NO: 6679 


gagcggagggggatgagaa 


7134 


7153 


1 


3 




9303 


9322 


SEQ ID NO: 6680 


cggggcgcgttccctgaca


3688 


3707 


1 


3 
Table 14. Sequences from human hepatitis C virus (HCV) (Direct Match Type)





Source                               Start Index  End Index  Match                                Start Index  End Index  Match #
SEQ ID NO: 6755 ttttttttttttttttttt  9446         9465       SEQ ID NO: 6758 ttttttttttttttttttt  9466         9485       2
SEQ ID NO: 6756 ttttttttttttttttttt  9446         9465       SEQ ID NO: 6759 ttttttttttttttttttt  9465         9484       1
SEQ ID NO: 6757 ttttttttttttttttttt  9447         9466       SEQ ID NO: 6760 ttttttttttttttttttt  9466         9485       1



Table 15. Sequences of Exemplary Gene Targets 

gi|4502152|ref|NM_000384.1| Homo sapiens apolipoprotein B (including Ag(x)
antigen) (APOB), mRNA

ATTCCCACCGGGACCTGCGGGGCTGAGTGCCCTTCTCGGTTGCTGCCGCTGAGGAGCCCGCCCAGCCAGC 
CAGGGCCGCGAGGCCGAGGCCAGGCCGCAGCCCAGGAGCCGCCCCACCGCAGCTGGCGATGGACCCGCCG 
AGGCCCGCGCTGCTGGCGCTGCTGGCGCTGCCTGCGCTGCTGCTGCTGCTGCTGGCGGGCGCCAGGGCCG 
AAGAGGAAATGCTGGAAAATGTCAGCCTGGTCTGTCCAAAAGATGCGACCCGATTCAAGCACCTCCGGAA 
GTACACATACAACTATGAGGCTGAGAGTTCCAGTGGAGTCCCTGGGACTGCTGATTCAAGAAGTGCCACC 
AGGATCAACTGCAAGGTTGAGCTGGAGGTTCCCCAGCTCTGCAGCTTCATCCTGAAGACCAGCCAGTGCA 

CCCTGAAAGAGGTGTATGGCTTCAACCCTGAGGGCAAAGCCTTGCTGAAGAAAACCAAGAACTCTGAGGA
GTTTGCTGCAGCCATGTCCAGGTATGAGCTCAAGCTGGCCATTCCAGAAGGGAAGCAGGTTTTCCTTTAC

CCGGAGAAAGATGAACCTACTTACATCCTGAACATCAAGAGGGGCATCATTTCTGCCCTCCTGGTTCCCC 
CAGAGACAGAAGAAGCCAAGCAAGTGTTGTTTCTGGATACCGTGTATGGAAACTGCTCCACTCACTTTAC 
CGTCAAGACGAGGAAGGGCAATGTGGCAACAGAAATATCCACTGAAAGAGACCTGGGGCAGTGTGATCGC 
TTCAAGCCCATCCGCACAGGCATCAGCCCACTTGCTCTCATCAAAGGCATGACCCGCCCCTTGTCAACTC 
TGATCAGCAGCAGCCAGTCCTGTCAGTACACACTGGACGCTAAGAGGAAGCATGTGGCAGAAGCCATCTG 
CAAGGAGCAACACCTCTTCCTGCCTTTCTCCTACAACAATAAGTATGGGATGGTAGCACAAGTGACACAG 
ACTTTGAAACTTGAAGACACACCAAAGATCAACAGCCGCTTCTTTGGTGAAGGTACTAAGAAGATGGGCC 
TCGCATTTGAGAGCACCAAATCCACATCACCTCCAAAGCAGGCCGAAGCTGTTTTGAAGACTCTCCAGGA 
ACTGAAAAAACTAACCATCTCTGAGCAAAATATCCAGAGAGCTAATCTCTTCAATAAGCTGGTTACTGAG 
CTGAGAGGCCTCAGTGATGAAGCAGTCACATCTCTCTTGCCACAGCTGATTGAGGTGTCCAGCCCCATCA 
CTTTACAAGCCTTGGTTCAGTGTGGACAGCCTCAGTGCTCCACTCACATCCTCCAGTGGCTGAAACGTGT 
GCATGCCAACCCCCTTCTGATAGATGTGGTCACCTACCTGGTGGCCCTGATCCCCGAGCCCTCAGCACAG 
CAGCTGCGAGAGATCTTCAACATGGCGAGGGATCAGCGCAGCCGAGCCACCTTGTATGCGCTGAGCCACG 
CGGTCAACAACTATCATAAGACAAACCCTACAGGGACCCAGGAGCTGCTGGACATTGCTAATTACCTGAT 
GGAACAGATTCAAGATGACTGCACTGGGGATGAAGATTACACCTATTTGATTCTGCGGGTCATTGGAAAT 
ATGGGCCAAACCATGGAGCAGTTAACTCCAGAACTCAAGTCTTCAATCCTCAAATGTGTCCAAAGTACAA 
AGCCATCACTGATGATCCAGAAAGCTGCCATCCAGGCTCTGCGGAAAATGGAGCCTAAAGACAAGGACCA 
GGAGGTTCTTCTTCAGACTTTCCTTGATGATGCTTCTCCGGGAGATAAGCGACTGGCTGCCTATCTTATG 
TTGATGAGGAGTCCTTCACAGGCAGATATTAACAAAATTGTCCAAATTCTACCATGGGAACAGAATGAGC 
AAGTGAAGAACTTTGTGGCTTCCCATATTGCCAATATCTTGAACTCAGAAGAATTGGATATCCAAGATCT 
GAAAAAGTTAGTGAAAGAAGCTCTGAAAGAATCTCAACTTCCAACTGTCATGGACTTCAGAAAATTCTCT 
CGGAACTATCAACTCTACAAATCTGTTTCTCTTCCATCACTTGACCCAGCCTCAGCCAAAATAGAAGGGA 
ATCTTATATTTGATCCAAATAACTACCTTCCTAAAGAAAGCATGCTGAAAACTACCCTCACTGCCTTTGG 
ATTTGCTTCAGCTGACCTCATCGAGATTGGCTTGGAAGGAAAAGGCTTTGAGCCAACATTGGAAGCTCTT 
TTTGGGAAGCAAGGATTTTTCCCAGACAGTGTCAACAAAGCTTTGTACTGGGTTAATGGTCAAGTTCCTG 
ATGGTGTCTCTAAGGTCTTAGTGGACCACTTTGGCTATACCAAAGATGATAAACATGAGCAGGATATGGT 
AAATGGAATAATGCTCAGTGTTGAGAAGCTGATTAAAGATTTGAAATCCAAAGAAGTCCCGGAAGCCAGA 
GCCTACCTCCGCATCTTGGGAGAGGAGCTTGGTTTTGCCAGTCTCCATGACCTCCAGCTCCTGGGAAAGC 
TGCTTCTGATGGGTGCCCGCACTCTGCAGGGGATCCCCCAGATGATTGGAGAGGTCATCAGGAAGGGCTC 
AAAGAATGACTTTTTTCTTCACTACATCTTCATGGAGAATGCCTTTGAACTCCCCACTGGAGCTGGATTA 
CAGTTGCAAATATCTTCATCTGGAGTCATTGCTCCCGGAGCCAAGGCTGGAGTAAAACTGGAAGTAGCCA
ACATGCAGGCTGAACTGGTGGCAAAACCCTCCGTGTCTGTGGAGTTTGTGACAAATATGGGCATCATCAT 
TCCGGACTTCGCTAGGAGTGGGGTCCAGATGAACACCAACTTCTTCCACGAGTCGGGTCTGGAGGCTCAT 
GTTGCCCTAAAAGCTGGGAAGCTGAAGTTTATCATTCCTTCCCCAAAGAGACCAGTCAAGCTGCTCAGTG 
GAGGCAACACATTACATTTGGTCTCTACCACCAAAACGGAGGTGATCCCACCTCTCATTGAGAACAGGCA 
GTCCTGGTCAGTTTGCAAGCAAGTCTTTCCTGGCCTGAATTACTGCACCTCAGGCGCTTACTCCAACGCC 
AGCTCCACAGACTCCGCCTCCTACTATCCGCTGACCGGGGACACCAGATTAGAGCTGGAACTGAGGCCTA 
CAGGAGAGATTGAGCAGTATTCTGTCAGCGCAACCTATGAGCTCCAGAGAGAGGACAGAGCCTTGGTGGA 
TACCCTGAAGTTTGTAACTCAAGCAGAAGGTGCGAAGCAGACTGAGGCTACCATGACATTCAAATATAAT 
CGGCAGAGTATGACCTTGTCCAGTGAAGTCCAAATTCCGGATTTTGATGTTGACCTCGGAACAATCCTCA 
GAGTTAATGATGAATCTACTGAGGGCAAAACGTCTTACAGACTCACCCTGGACATTCAGAACAAGAAAAT 
TACTGAGGTCGCCCTCATGGGCCACCTAAGTTGTGACACAAAGGAAGAAAGAAAAATCAAGGGTGTTATT 
TCCATACCCCGTTTGCAAGCAGAAGCCAGAAGTGAGATCCTCGCCCACTGGTCGCCTGCCAAACTGCTTC 
TCCAAATGGACTCATCTGCTACAGCTTATGGCTCCACAGTTTCCAAGAGGGTGGCATGGCATTATGATGA 
AGAGAAGATTGAATTTGAATGGAACACAGGCACCAATGTAGATACCAAAAAAATGACTTCCAATTTCCCT 
GTGGATCTCTCCGATTATCCTAAGAGCTTGCATATGTATGCTAATAGACTCCTGGATCACAGAGTCCCTG 
AAACAGACATGACTTTCCGGCACGTGGGTTCCAAATTAATAGTTGCAATGAGCTCATGGCTTCAGAAGGC 
ATCTGGGAGTCTTCCTTATACCCAGACTTTGCAAGACCACCTCAATAGCCTGAAGGAGTTCAACCTCCAG 
AACATGGGATTGCCAGACTTCCACATCCCAGAAAACCTCTTCTTAAAAAGCGATGGCCGGGTCAAATATA 
CCTTGAACAAGAACAGTTTGAAAATTGAGATTCCTTTGCCTTTTGGTGGCAAATCCTCCAGAGATCTAAA 
GATGTTAGAGACTGTTAGGACACCAGCCCTCCACTTCAAGTCTGTGGGATTCCATCTGCCATCTCGAGAG 
TTCCAAGTCCCTACTTTTACCATTCCCAAGTTGTATCAACTGCAAGTGCCTCTCCTGGGTGTTCTAGACC 
TCTCCACGAATGTCTACAGCAACTTGTACAACTGGTCCGCCTCCTACAGTGGTGGCAACACCAGCACAGA 
CCATTTCAGCCTTCGGGCTCGTTACCACATGAAGGCTGACTCTGTGGTTGACCTGCTTTCCTACAATGTG 
CAAGGATCTGGAGAAACAACATATGACCACAAGAATACGTTCACACTATCATGTGATGGGTCTCTACGCC 
ACAAATTTCTAGATTCGAATATCAAATTCAGTCATGTAGAAAAACTTGGAAACAACCCAGTCTCAAAAGG 
TTTACTAATATTCGATGCATCTAGTTCCTGGGGACCACAGATGTCTGCTTCAGTTCATTTGGACTCCAAA 
AAGAAACAGCATTTGTTTGTCAAAGAAGTCAAGATTGATGGGCAGTTCAGAGTCTCTTCGTTCTATGCTA
AAGGCACATATGGCCTGTCTTGTCAGAGGGATCCTAACACTGGCCGGCTCAATGGAGAGTCCAACCTGAG 
GTTTAACTCCTCCTACCTCCAAGGCACCAACCAGATAACAGGAAGATATGAAGATGGAACCCTCTCCCTC 
ACCTCCACCTCTGATCTGCAAAGTGGCATCATTAAAAATACTGCTTCCCTAAAGTATGAGAACTACGAGC 
TGACTTTAAAATCTGACACCAATGGGAAGTATAAGAACTTTGCCACTTCTAACAAGATGGATATGACCTT 
CTCTAAGCAAAATGCACTGCTGCGTTCTGAATATCAGGCTGATTACGAGTCATTGAGGTTCTTCAGCCTG 
CTTTCTGGATCACTAAATTCCCATGGTCTTGAGTTAAATGCTGACATCTTAGGCACTGACAAAATTAATA 
GTGGTGCTCACAAGGCGACACTAAGGATTGGCCAAGATGGAATATCTACCAGTGCAACGACCAACTTGAA 
GTGTAGTCTCCTGGTGCTGGAGAATGAGCTGAATGCAGAGCTTGGCCTCTCTGGGGCATCTATGAAATTA 
ACAACAAATGGCCGCTTCAGGGAACACAATGCAAAATTCAGTCTGGATGGGAAAGCCGCCCTCACAGAGC 
TATCACTGGGAAGTGCTTATCAGGCCATGATTCTGGGTGTCGACAGCAAAAACATTTTCAACTTCAAGGT 
CAGTCAAGAAGGACTTAAGCTCTCAAATGACATGATGGGCTCATATGCTGAAATGAAATTTGACCACACA 
AACAGTCTGAACATTGCAGGCTTATCACTGGACTTCTCTTCAAAACTTGACAACATTTACAGCTCTGACA 
AGTTTTATAAGCAAACTGTTAATTTACAGCTACAGCCCTATTCTCTGGTAACTACTTTAAACAGTGACCT 
GAAATACAATGCTCTGGATCTCACCAACAATGGGAAACTACGGCTAGAACCCCTGAAGCTGCATGTGGCT 
GGTAACCTAAAAGGAGCCTACCAAAATAATGAAATAAAACACATCTATGCCATCTCTTCTGCTGCCTTAT 
CAGCAAGCTATAAAGCAGACACTGTTGCTAAGGTTCAGGGTGTGGAGTTTAGCCATCGGCTCAACACAGA 
CATCGCTGGGCTGGCTTCAGCCATTGACATGAGCACAAACTATAATTCAGACTCACTGCATTTCAGCAAT 
GTCTTCCGTTCTGTAATGGCCCCGTTTACCATGACCATCGATGCACATACAAATGGCAATGGGAAACTCG 
CTCTCTGGGGAGAACATACTGGGCAGCTGTATAGCAAATTCCTGTTGAAAGCAGAACCTCTGGCATTTAC 
TTTCTCTCATGATTACAAAGGCTCCACAAGTCATCATCTCGTGTCTAGGAAAAGCATCAGTGCAGCTCTT 
GAACACAAAGTCAGTGCCCTGCTTACTCCAGCTGAGCAGACAGGCACCTGGAAACTCAAGACCCAATTTA 
ACAACAATGAATACAGCCAGGACTTGGATGCTTACAACACTAAAGATAAAATTGGCGTGGAGCTTACTGG 
ACGAACTCTGGCTGACCTAACTCTACTAGACTCCCCAATTAAAGTGCCACTTTTACTCAGTGAGCCCATC 
AATATCATTGATGCTTTAGAGATGAGAGATGCCGTTGAGAAGCCCCAAGAATTTACAATTGTTGCTTTTG 
TAAAGTATGATAAAAACCAAGATGTTCACTCCATTAACCTCCCATTTTTTGAGACCTTGCAAGAATATTT 
TGAGAGGAATCGACAAACCATTATAGTTGTAGTGGAAAACGTACAGAGAAACCTGAAGCACATCAATATT 
GATCAATTTGTAAGAAAATACAGAGCAGCCCTGGGAAAACTCCCACAGCAAGCTAATGATTATCTGAATT 
CATTCAATTGGGAGAGACAAGTTTCACATGCCAAGGAGAAACTGACTGCTCTCACAAAAAAGTATAGAAT 
TACAGAAAATGATATACAAATTGCATTAGATGATGCCAAAATCAACTTTAATGAAAAACTATCTCAACTG
CAGACATATATGATACAATTTGATCAGTATATTAAAGATAGTTATGATTTACATGATTTGAAAATAGCTA
TTGCTAATATTATTGATGAAATCATTGAAAAATTAAAAAGTCTTGATGAGCACTATCATATCCGTGTAAA 
TTTAGTAAAAACAATCCATGATCTACATTTGTTTATTGAAAATATTGATTTTAACAAAAGTGGAAGTAGT 
ACTGCATCCTGGATTCAAAATGTGGATACTAAGTACCAAATCAGAATCCAGATACAAGAAAAACTGCAGC 
AGCTTAAGAGACACATACAGAATATAGACATCCAGCACCTAGCTGGAAAGTTAAAACAACACATTGAGGC
TATTGATGTTAGAGTGCTTTTAGATCAATTGGGAACTACAATTTCATTTGAAAGAATAAATGATGTTCTT 
GAGCATGTCAAACACTTTGTTATAAATCTTATTGGGGATTTTGAAGTAGCTGAGAAAATCAATGCCTTCA 
GAGCCAAAGTCCATGAGTTAATCGAGAGGTATGAAGTAGACCAACAAATCCAGGTTTTAATGGATAAATT 
AGTAGAGTTGACCCACCAATACAAGTTGAAGGAGACTATTCAGAAGCTAAGCAATGTCCTACAACAAGTT 

AAGATAAAAGATTACTTTGAGAAATTGGTTGGATTTATTGATGATGCTGTGAAGAAGCTTAATGAATTAT
CTTTTAAAACATTCATTGAAGATGTTAACAAATTCCTTGACATGTTGATAAAGAAATTAAAGTCATTTGA 
TTACCACCAGTTTGTAGATGAAACCAATGACAAAATCCGTGAGGTGACTCAGAGACTCAATGGTGAAATT 
CAGGCTCTGGAACTACCACAAAAAGCTGAAGCATTAAAACTGTTTTTAGAGGAAACCAAGGCCACAGTTG 
CAGTGTATCTGGAAAGCCTACAGGACACCAAAATAACCTTAATCATCAATTGGTTACAGGAGGCTTTAAG 

TTCAGCATCTTTGGCTCACATGAAGGCCAAATTCCGAGAGACTCTAGAAGATACACGAGACCGAATGTAT
CAAATGGACATTCAGCAGGAACTTCAACGATACCTGTCTCTGGTAGGCCAGGTTTATAGCACACTTGTCA 
CCTACATTTCTGATTGGTGGACTCTTGCTGCTAAGAACCTTACTGACTTTGCAGAGCAATATTCTATCCA 
AGATTGGGCTAAACGTATGAAAGCATTGGTAGAGCAAGGGTTCACTGTTCCTGAAATCAAGACCATCCTT 
GGGACCATGCCTGCCTTTGAAGTCAGTCTTCAGGCTCTTCAGAAAGCTACCTTCCAGACACCTGATTTTA 

TAGTCCCCCTAACAGATTTGAGGATTCCATCAGTTCAGATAAACTTCAAAGACTTAAAAAATATAAAAAT
CCCATCCAGGTTTTCCACACCAGAATTTACCATCCTTAACACCTTCCACATTCCTTCCTTTACAATTGAC 
TTTGTCGAAATGAAAGTAAAGATCATCAGAACCATTGACCAGATGCAGAACAGTGAGCTGCAGTGGCCCG
TTCCAGATATATATCTCAGGGATCTGAAGGTGGAGGACATTCCTCTAGCGAGAATCACCCTGCCAGACTT 
CCGTTTACCAGAAATCGCAATTCCAGAATTCATAATCCCAACTCTCAACCTTAATGATTTTCAAGTTCCT 

GACCTTCACATACCAGAATTCCAGCTTCCCCACATCTCACACACAATTGAAGTACCTACTTTTGGCAAGC
TATACAGTATTCTGAAAATCCAATCTCCTCTTTTCACATTAGATGCAAATGCTGACATAGGGAATGGAAC 
CACCTCAGCAAACGAAGCAGGTATCGCAGCTTCCATCACTGCCAAAGGAGAGTCCAAATTAGAAGTTCTC 
AATTTTGATTTTCAAGCAAATGCACAACTCTCAAACCCTAAGATTAATCCGCTGGCTCTGAAGGAGTCAG 
TGAAGTTCTCCAGCAAGTACCTGAGAACGGAGCATGGGAGTGAAATGCTGTTTTTTGGAAATGCTATTGA 

GGGAAAATGAAACACAGTGGCAAGTTTACACACAGAAAAAAATACACTGGAGCTTAGTAATGGAGTGATT
GTCAAGATAAACAATCAGCTTACCCTGGATAGCAACACTAAATACTTCCACAAATTGAACATCCCCAAAC 
TGGACTTCTCTAGTCAGGCTGACCTGCGCAACGAGATCAAGACACTGTTGAAAGCTGGCCACATAGCATG 
GACTTCTTCTGGAAAAGGGTCATGGAAATGGGCCTGCCCCAGATTCTCAGATGAGGGAACACATGAATCA 
CAAATTAGTTTCACCATAGAAGGACCCCTCACTTCCTTTGGACTGTCCAATAAGATCAATAGCAAACACC 

TAAGAGTAAACCAAAACTTGGTTTATGAATCTGGCTCCCTCAACTTTTCTAAACTTGAAATTCAATCACA
AGTCGATTCCCAGCATGTGGGCCACAGTGTTCTAACTGCTAAAGGCATGGCACTGTTTGGAGAAGGGAAG 
GCAGAGTTTACTGGGAGGCATGATGCTCATTTAAATGGAAAGGTTATTGGAACTTTGAAAAATTCTCTTT 
TCTTTTCAGCCCAGCCATTTGAGATCACGGCATCCACAAACAATGAAGGGAATTTGAAAGTTCGTTTTCC 
ATTAAGGTTAACAGGGAAGATAGACTTCCTGAATAACTATGCACTGTTTCTGAGTCCCAGTGCCCAGCAA 

GCAAGTTGGCAAGTAAGTGCTAGGTTCAATCAGTATAAGTACAACCAAAATTTCTCTGCTGGAAACAACG
AGAACATTATGGAGGCCCATGTAGGAATAAATGGAGAAGCAAATCTGGATTTCTTAAACATTCCTTTAAC 
AATTCCTGAAATGCGTCTACCTTACACAATAATCACAACTCCTCCACTGAAAGATTTCTCTCTATGGGAA 
AAAACAGGCTTGAAGGAATTCTTGAAAACGACAAAGCAATCATTTGATTTAAGTGTAAAAGCTCAGTATA 
AGAAAAACAAACACAGGCATTCCATCACAAATCCTTTGGCTGTGCTTTGTGAGTTTATCAGTCAGAGCAT 

CAAATCCTTTGACAGGCATTTTGAAAAAAACAGAAACAATGCATTAGATTTTGTCACCAAATCCTATAAT
GAAACAAAAATTAAGTTTGATAAGTACAAAGCTGAAAAATCTCACGACGAGCTCCCCAGGACCTTTCAAA 
TTCCTGGATACACTGTTCCAGTTGTCAATGTTGAAGTGTCTCCATTCACCATAGAGATGTCGGCATTCGG 
CTATGTGTTCCCAAAAGCAGTCAGCATGCCTAGTTTCTCCATCCTAGGTTCTGACGTCCGTGTGCCTTCA 
TACACATTAATCCTGCCATCATTAGAGCTGCCAGTCCTTCATGTCCCTAGAAATCTCAAGCTTTCTCTTC 

CACATTTCAAGGAATTGTGTACCATAAGCCATATTTTTATTCCTGCCATGGGCAATATTACCTATGATTT
CTCCTTTAAATCAAGTGTCATCACACTGAATACCAATGCTGAACTTTTTAACCAGTCAGATATTGTTGCT 
CATCTCCTTTCTTCATCTTCATCTGTCATTGATGCACTGCAGTACAAATTAGAGGGCACCACAAGATTGA 
CAAGAAAAAGGGGATTGAAGTTAGCCACAGCTCTGTCTCTGAGCAACAAATTTGTGGAGGGTAGTCATAA 
CAGTACTGTGAGCTTAACCACGAAAAATATGGAAGTGTCAGTGGCAAAAACCACAAAAGCCGAAATTCCA 
ATTTTGAGAATGAATTTCAAGCAAGAACTTAATGGAAATACCAAGTCAAAACCTACTGTCTCTTCCTCCA
TGGAATTTAAGTATGATTTCAATTCTTCAATGCTGTACTCTACCGCTAAAGGAGCAGTTGACCACAAGCT 
TAGCTTGGAAAGCCTCACCTCTTACTTTTCCATTGAGTCATCTACCAAAGGAGATGTCAAGGGTTCGGTT 
CTTTCTCGGGAATATTCAGGAACTATTGCTAGTGAGGCCAACACTTACTTGAATTCCAAGAGCACACGGT
CTTCAGTGAAGCTGCAGGGCACTTCCAAAATTGATGATATCTGGAACCTTGAAGTAAAAGAAAATTTTGC 
TGGAGAAGCCACACTCCAACGCATATATTCCCTCTGGGAGCACAGTACGAAAAACCACTTACAGCTAGAG 
GGCCTCTTTTTCACCAACGGAGAACATACAAGCAAAGCCACCCTGGAACTCTCTCCATGGCAAATGTCAG 
CTCTTGTTCAGGTCCATGCAAGTCAGCCCAGTTCCTTCCATGATTTCCCTGACCTTGGCCAGGAAGTGGC
CCTGAATGCTAACACTAAGAACCAGAAGATCAGATGGAAAAATGAAGTCCGGATTCATTCTGGGTCTTTC 
CAGAGCCAGGTCGAGCTTTCCAATGACCAAGAAAAGGCACACCTTGACATTGCAGGATCCTTAGAAGGAC 
ACCTAAGGTTCCTCAAAAATATCATCCTACCAGTCTATGACAAGAGCTTATGGGATTTCCTAAAGCTGGA 

AATGGCTATTCATTCTCCATCCCTGTAAAAGTTTTGGCTGATAAATTCATTACTCCTGGGCTGAAACTAA
ATGATCTAAATTCAGTTCTTGTCATGCCTACGTTCCATGTCCCATTTACAGATCTTCAGGTTCCATCGTG 
CAAACTTGACTTCAGAGAAATACAAATCTATAAGAAGCTGAGAACTTCATCATTTGCCCTCAACCTACCA 
ACACTCCCCGAGGTAAAATTCCCTGAAGTTGATGTGTTAACAAAATATTCTCAACCAGAAGACTCCTTGA 
TTCCCTTTTTTGAGATAACCGTGCCTGAATCTCAGTTAACTGTGTCCCAGTTCACGCTTCCAAAAAGTGT 

TTCAGATGGCATTGCTGCTTTGGATCTAAATGCAGTAGCCAACAAGATCGCAGACTTTGAGTTGCCCACC
ATCATCGTGCCTGAGCAGACCATTGAGATTCCCTCCATTAAGTTCTCTGTACCTGCTGGAATTGTCATTC 
CTTCCTTTCAAGCACTGACTGCACGCTTTGAGGTAGACTCTCCCGTGTATAATGCCACTTGGAGTGCCAG 
TTTGAAAAACAAAGCAGATTATGTTGAAACAGTCCTGGATTCCACATGCAGCTCAACCGTACAGTTCCTA 
GAATATGAACTAAATGTTTTGGGAACACACAAAATCGAAGATGGTACGTTAGCCTCTAAGACTAAAGGAA 

CACTTGCACACCGTGACTTCAGTGCAGAATATGAAGAAGATGGCAAATTTGAAGGACTTCAGGAATGGGA
AGGAAAAGCGCACCTCAATATCAAAAGCCCAGCGTTCACCGATCTCCATCTGCGCTACCAGAAAGACAAG 
AAAGGCATCTCCACCTCAGCAGCCTCCCCAGCCGTAGGCACCGTGGGCATGGATATGGATGAAGATGACG 
ACTTTTCTAAATGGAACTTCTACTACAGCCCTCAGTCCTCTCCAGATAAAAAACTCACCATATTCAAAAC 
TGAGTTGAGGGTCCGGGAATCTGATGAGGAAACTCAGATCAAAGTTAATTGGGAAGAAGAGGCAGCTTCT 

GGCTTGCTAACCTCTCTGAAAGACAACGTGCCCAAGGCCACAGGGGTCCTTTATGATTATGTCAACAAGT
ACCACTGGGAACACACAGGGCTCACCCTGAGAGAAGTGTCTTCAAAGCTGAGAAGAAATCTGCAGAACAA 
TGCTGAGTGGGTTTATCAAGGGGCCATTAGGCAAATTGATGATATCGACGTGAGGTTCCAGAAAGCAGCC 
AGTGGCACCACTGGGACCTACCAAGAGTGGAAGGACAAGGCCCAGAATCTGTACCAGGAACTGTTGACTC 
AGGAAGGCCAAGCCAGTTTCCAGGGACTCAAGGATAACGTGTTTGATGGCTTGGTACGAGTTACTCAAAA 

ATTCCATATGAAAGTCAAGCATCTGATTGACTCACTCATTGATTTTCTGAACTTCCCCAGATTCCAGTTT
CCGGGGAAACCTGGGATATACACTAGGGAGGAACTTTGCACTATGTTCATAAGGGAGGTAGGGACGGTAC 
TGTCCCAGGTATATTCGAAAGTCCATAATGGTTCAGAAATACTGTTTTCCTATTTCCAAGACCTAGTGAT 
TACACTTCCTTTCGAGTTAAGGAAACATAAACTAATAGATGTAATCTCGATGTATAGGGAACTGTTGAAA 
GATTTATCAAAAGAAGCCCAAGAGGTATTTAAAGCCATTCAGTCTCTCAAGACCACAGAGGTGCTACGTA 

ATCTTCAGGACCTTTTACAATTCATTTTCCAACTAATAGAAGATAACATTAAACAGCTGAAAGAGATGAA
ATTTACTTATCTTATTAATTATATCCAAGATGAGATCAACACAATCTTCAATGATTATATCCCATATGTT 
TTTAAATTGTTGAAAGAAAACCTATGCCTTAATCTTCATAAGTTCAATGAATTTATTCAAAACGAGCTTC 
AGGAAGCTTCTCAAGAGTTACAGCAGATCCATCAATACATTATGGCCCTTCGTGAAGAATATTTTGATCC 
AAGTATAGTTGGCTGGACAGTGAAATATTATGAACTTGAAGAAAAGATAGTCAGTCTGATCAAGAACCTG 

TTAGTTGCTCTTAAGGACTTCCATTCTGAATATATTGTCAGTGCCTCTAACTTTACTTCCCAACTCTCAA
GTCAAGTTGAGCAATTTCTGCACAGAAATATTCAGGAATATCTTAGCATCCTTACCGATCCAGATGGAAA 
AGGGAAAGAGAAGATTGCAGAGCTTTCTGCCACTGCTCAGGAAATAATTAAAAGCCAGGCCATTGCGACG 
AAGAAAATAATTTCTGATTACCACCAGCAGTTTAGATATAAACTGCAAGATTTTTCAGACCAACTCTCTG 
ATTACTATGAAAAATTTATTGCTGAATCCAAAAGATTGATTGACCTGTCCATTCAAAACTACCACACATT 

TCTGATATACATCACGGAGTTACTGAAAAAGCTGCAATCAACCACAGTCATGAACCCCTACATGAAGCTT
GCTCCAGGAGAACTTACTATCATCCTCTAATTTTTTAAAAGAAATCTTCATTTATTCTTCTTTTCCAATT 
GAACTTTCACATAGCACAGAAAAAATTCAAACTGCCTATATTGATAAAACCATACAGTGAGCCAGCCTTG 
CAGTAGGCAGTAGACTATAAGCAGAAGCACATATGAACTGGACCTGCACCAAAGCTGGCACCAGGGCTCG 
GAAGGTCTCTGAACTCAGAAGGATGGCATTTTTTGCAAGTTAAAGAAAATCAGGATCTGAGTTATTTTGC 

TAAACTTGGGGGAGGAGGAACAAATAAATGGAGTCTTTATTGTGTATCATA (SEQ ID NO: 6681)



>gi|4557442|ref|NM_000078.1| Homo sapiens cholesteryl ester transfer
protein, plasma (CETP), mRNA

GTGAATCTCTGGGGCCAGGAAGACCCTGCTGCCCGGAAGAGCCTCATGTTCCGTGGGGGCTGGGCGGACA 
TACATATACGGGCTCCAGGCTGAACGGCTCGGGCCACTTACACACCACTGCCTGATAACCATGCTGGCTG
CCACAGTCCTGACCCTGGCCCTGCTGGGCAATGCCCATGCCTGCTCCAAAGGCACCTCGCACGAGGCAGG 
CATCGTGTGCCGCATCACCAAGCCTGCCCTCCTGGTGTTGAACCACGAGACTGCCAAGGTGATCCAGACC
GCCTTCCAGCGAGCCAGCTACCCAGATATCACGGGCGAGAAGGCCATGATGCTCCTTGGCCAAGTCAAGT 
ATGGGTTGCACAACATCCAGATCAGCCACTTGTCCATCGCCAGCAGCCAGGTGGAGCTGGTGGAAGCCAA 
GTCCATTGATGTCTCCATTCAGAACGTGTCTGTGGTCTTCAAGGGGACCCTGAAGTATGGCTACACCACT 
GCCTGGTGGCTGGGTATTGATCAGTCCATTGACTTCGAGATCGACTCTGCCATTGACCTCCAGATCAACA
CACAGCTGACCTGTGACTCTGGTAGAGTGCGGACCGATGCCCCTGACTGCTACCTGTCTTTCCATAAGCT 
GCTCCTGCATCTCCAAGGGGAGCGAGAGCCTGGGTGGATCAAGCAGCTGTTCACAAATTTCATCTCCTTC
ACCCTGAAGCTGGTCCTGAAGGGACAGATCTGCAAAGAGATCAACGTCATCTCTAACATCATGGCCGATT 
TTGTCCAGACAAGGGCTGCCAGCATCCTTTCAGATGGAGACATTGGGGTGGACATTTCCCTGACAGGTGA 

TCCCGTCATCACAGCCTCCTACCTGGAGTCCCATCACAAGGGTCATTTCATCTACAAGAATGTCTCAGAG
GACCTCCCCCTCCCCACCTTCTCGCCCACACTGCTGGGGGACTCCCGCATGCTGTACTTCTGGTTCTCTG 
AGCGAGTCTTCCACTCGCTGGCCAAGGTAGCTTTCCAGGATGGCCGCCTCATGCTCAGCCTGATGGGAGA 
CGAGTTCAAGGCAGTGCTGGAGACCTGGGGCTTCAACACCAACCAGGAAATCTTCCAAGAGGTTGTCGGC 
GGCTTCCCCAGCCAGGCCCAAGTCACCGTCCACTGCCTCAAGATGCCCAAGATCTCCTGCCAAAACAAGG 

GAGTCGTGGTCAATTCTTCAGTGATGGTGAAATTCCTCTTTCCACGCCCAGACCAGCAACATTCTGTAGC
TTACACATTTGAAGAGGATATCGTGACTACCGTCCAGGCCTCCTATTCTAAGAAAAAGCTCTTCTTAAGC 
CTCTTGGATTTCCAGATTACACCAAAGACTGTTTCCAACTTGACTGAGAGCAGCTCCGAGTCCATCCAGA 
GCTTCCTGCAGTCAATGATCACCGCTGTGGGCATCCCTGAGGTCATGTCTCGGCTCGAGGTAGTGTTTAC 
AGCCCTCATGAACAGCAAAGGCGTGAGCCTCTTCGACATCATCAACCCTGAGATTATCACTCGAGATGGC 

TTCCTGCTGCTGCAGATGGACTTTGGCTTCCCTGAGCACCTGCTGGTGGATTTCCTCCAGAGCTTGAGCT
AGAAGTCTCCAAGGAGGTCGGGATGGGGCTTGTAGCAGAAGGCAAGCACCAGGCTCACAGCTGGAACCCT 
GGTGTCTCCTCCAGCGTGGTGGAAGTTGGGTTAGGAGTACGGAGATGGAGATTGGCTCCCAACTCCTCCC 
TATCCTAAAGGCCCACTGGCATTAAAGTGCTGTATCCAAG (SEQ ID NO: 6682) 




>gi|414668|emb|X75500.1|HSMTP H. sapiens mRNA for microsomal triglyceride 
transfer protein 

TGCAGTTGAGGATTGCTGGTCAATATGATTCTTCTTGCTGTGCTTTTTCTCTGCTTCATTTCCTCATATT 
CAGCTTCTGTTAAAGGTCACACAACTGGTCTCTCATTAAATAATGACCGGCTGTACAAGCTCACGTACTC 

CACTGAAGTTCTTCTTGATCGGGGCAAAGGAAAACTGCAAGACAGCGTGGGCTACCGCATTTCCTCCAAC
GTGGATGTGGCCTTACTATGGAGGAATCCTGATGGTGATGATGACCAGTTGATCCAAATAACGATGAAGG 
ATGTAAATGTTGAAAATGTGAATCAGCAGAGAGGAGAGAAGAGCATCTTCAAAGGAAAAAGCCCATCTAA 
AATAATGGGAAAGGAAAACTTGGAAGCTCTGCAAAGACCTACGCTCCTTCATCTAATCCATGGAAAGGTC 
AAAGAGTTCTACTCATATCAAAATGAGGCAGTGGCCATAGAAAATATCAAGAGAGGTCTGGCTAGCCTAT 

TTCAGACACAGTTAAGCTCTGGAACCACCAATGAGGTAGATATCTCTGGAAATTGTAAAGTGACCTACCA
GGCTCATCAAGACAAAGTGATCAAAATTAAGGCCTTGGATTCATGCAAAATAGCGAGGTCTGGATTTACG 
ACCCCAAATCAGGTCTTGGGTGTCAGTTCAAAAGCTACATCTGTCACCACCTATAAGATAGAAGACAGCT 
TTGTTATAGCTGTGCTTGCTGAAGAAACACACAATTTTGGACTGAATTTCCTACAAACCATTAAGGGGAA 
AATAGTATCGAAGCAGAAATTAGAGCTGAAGACAACCGAAGCAGGCCCAAGATTGATGTCTGGAAAGCAG 

GCTGCAGCCATAATCAAAGCAGTTGATTCAAAGTACACGGCCATTCCCATTGTGGGGCAGGTCTTCCAGA
GCCACTGTAAAGGATGTCCTTCTCTCTCGGAGCTCTGGCGGTCCACCAGGAAATACCTGCAGCCTGACAA 
CCTTTCCAAGGCTGAGGCTGTCAGAAACTTCCTGGCCTTCATTCAGCACCTCAGGACTGCGAAGAAAGAA 
GAGATCCTTCAAATACTAAAGATGGAAAATAAGGAAGTATTACCTCAGCTGGTGGATGCTGTCACCTCTG 
CTCAGACCTCAGACTCATTAGAAGCCATTTTGGACTTTTTGGATTTCAAAAGTGACAGCAGCATTATCCT 

CCAGGAGAGGTTTCTCTATGCCTGTGGATTTGCTTCTCATCCCAATGAAGAACTCCTGAGAGCCCTCATT
AGTAAGTTCAAAGGTTCTATTGGTAGCAGTGACATCAGAGAAACTGTTATGATCATCACTGGGACACTTG 
TCAGAAAGTTGTGTCAGAATGAAGGCTGCAAACTCAAAGCAGTAGTGGAAGCTAAGAAGTTAATCCTGGG 
AGGACTTGAAAAAGCAGAGAAAAAAGAGGACACCAGGATGTATCTGCTGGCTTTGAAGAATGCCCTGCTT 
CCAGAAGGCATCCCAAGTCTTCTGAAGTATGCAGAAGCAGGAGAAGGGCCCATCAGCCACCTGGCTACCA 

CTGCTCTCCAGAGATATGATCTCCCTTTCATAACTGATGAGGTGAAGAAGACCTTAAACAGAATATACCA
CCAAAACCGTAAAGTTCATGAAAAGACTGTGCGCACTGCTGCAGCTGCTATCATTTTAAATAACAATCCA 
TCCTACATGGACGTCAAGAACATCCTGCTGTCTATTGGGGAGCTTCCCCAAGAAATGAATAAATACATGC 
TCGCCATTGTTCAAGACATCCTACGTTTTGAAATGCCTGCAAGCAAAATTGTCCGTCGAGTTCTGAAGGA 
AATGGTCGCTCACAATTATGACCGTTTCTCCAGGAGTGGATCTTCTTCTGCCTACACTGGCTACATAGAA 

CGTAGTCCCCGTTCGGCATCTACTTACAGCCTAGACATTCTCTACTCGGGTTCTGGCATTCTAAGGAGAA
GTAACCTGAACATCTTTCAGTACATTGGGAAGGCTGGTCTTCACGGTAGCCAGGTGGTTATTGAAGCCCA
AGGACTGGAAGCCTTAATCGCAGCCACCCCTGACGAGGGGGAGGAGAACCTTGACTCCTATGCTGGTATG 
TCAGCCATCCTCTTTGATGTTCAGCTCAGACCTGTCACCTTTTTCAACGGATACAGTGATTTGATGTCCA 
AAATGCTGTCAGCATCTGGCGACCCTATCAGTGTGGTGAAAGGACTTATTCTGCTAATAGATCATTCTCA 
GGAACTTCAGTTACAATCTGGACTAAAAGCCAATATAGAGGTCCAGGGTGGTCTAGCTATTGATATTTCA 
GGTGCAATGGAGTTTAGCTTGTGGTATCGTGAGTCTAAAACCCGAGTGAAAAATAGGGTGACTGTGGTAA 
TAACCACTGACATCACAGTGGACTCCTCTTTTGTGAAAGCTGGCCTGGAAACCAGTACAGAAACAGAAGC 
AGGCTTGGAGTTTATCTCCACAGTGCAGTTTTCTCAGTACCCATTCTTAGTTTGCATGCAGATGGACAAG 
GATGAAGCTCCATTCAGGCAATTTGAGAAAAAGTACGAAAGGCTGTCCACAGGCAGAGGTTATGTCTCTC 
AGAAAAGAAAAGAAAGCGTATTAGCAGGATGTGAATTCCCGCTCCATCAAGAGAACTCAGAGATGTGCAA 
AGTGGTGTTTGCCCCTCAGCCGGATAGTACTTCCAGCGGATGGTTTTGAAACTGACCTGTGATATTTTAC 
TTGAATTTGTCTCCCCGAAAGGGACACAATGTGGCATGACTAAGTACTTGCTCTCTGAGAGCACAGCGTT 
TACATATTTACCTGTATTTAAGATTTTTGTAAAAAGCTACAAAAAACTGCAGTTTGATCAAATTTGGGTA 
TATGCAGTATGCTACCCACAGCGTCATTTTGAATCATCATGTGACGCTTTCAACAACGTTCTTAGTTTAC 
TTATACCTCTCTCAAATCTCATTTGGTACAGTCAGAATAGTTATTCTCTAAGAGGAAACTAGTGTTTGTT 
AAAAACAAAAATAAAAACAAAACCACACAAGGAGAACCCAATTTTGTTTCAACAATTTTTGATCAATGTA
TATGAAGCTCTTGATAGGACTTCCTTAAGCATGACGGGAAAACCAAACACGTTCCCTAATCAGGAAAAAA 
AAAAAAAAAAAAAAGTAAGACACAAACAAACCATTTTTTTCTCTTTTTTTGGAGTTGGGGGCCCAGGGAG 
AAGGGACAAGGCTTTTAAAAGACTTGTTAGCCAACTTCAAGAATTAATATTTATGTCTCTGTTATTGTTA 
GTTTTAAGCCTTAAGGTAGAAGGCACATAGAAATAACATC (SEQ ID NO: 6683) 

>gi|1217638|emb|X91148.1|HSMTTP H. sapiens mRNA for microsomal triglyceride
transfer protein 

TGCAGTTGAGGATTGCTGGTCAATATGATTCTTCTTGCTGTGCTTTTTCTCTGCTTCATTTCCTCATATT 
CAGCTTCTGTTAAAGGTCACACAACTGGTCTCTCATTAAATAATGACCGGCTGTACAAGCTCACGTACTC 
CACTGAAGTTCTTCTTGATCGGGGCAAAGGAAAACTGCAAGACAGCGTGGGCTACCGCATTTCCTCCAAC 
GTGGATGTGGCCTTACTATGGAGGAATCCTGATGGTGATGATGACCAGTTGATCCAAATAACGATGAAGG 
ATGTAAATGTTGAAAATGTGAATCAGCAGAGAGGAGAGAAGAGCATCTTCAAAGGMAAAGCCCATCTAA 
AATAATGGGAAAGGAARACTTGGAAGCTCTGCAAAGACCTACGCTCCTTCATCTAATCCATGGAAAGGTC 
AAAGAGTTCTACTCATATCAAAATGAGGCAGTGGCCATAGAAAATATCAAGAGAGGTCTGGCTAGCCTAT 
TTCAGACACAGTTAAGCTCTGGAACCACCAATGAGGTAGATATCTCTGGAAATTGTAAAGTGACCTACCA 
GGCTCATCAAGACAARGTGATCAAAATTAAGGCCTTGGATTCATGCAAAATAGCGAGGTCTGGATTTACG 
ACCCCAAATCAGGTCTTGGGTGTCAGTTCAAAAGCTACATCTGTCACCACCTATAAGATAGAAGACAGCT 
TTGTTATAGCTGTGCTTGCTGAAGAAACACACAATTTTGGACTGAATTTCCTACAAACCATTAAGGGGAA 
AATAGTATCGAAGCAGAAATTAGAGCTGAAGACAACCGAAGCAGGCCCAAGATTGATGTCTGGAAAGCAG 
GCTGCAGCCATAATCAAAGCAGTTGATTCAAAGTACACGGCCATTCCCATTGTGGGGCAGGTCTTCCAGA 
GCCACTGTAAAGGATGTCCTTCTCTCTCGGAGCTCTGGCGGTCCACCAGGAAATACCTGCAGCCTGACAA 
CCTTTCCAAGGCTGAGGCTGTCAGAAACTTCCTGGCCTTCATTCAGCACCTCAGGACTGCGAAGAAAGAA 
GAGATCCTTCAAATACTAAAGATGGAAAATAAGGAAGTATTACCTCAGCTGGTGGATGCTGTCACCTCTG 
CTCAGACCTCAGACTCATTAGAAGCCATTTTGGACTTTTTGGATTTCAAAAGTGACAGCAGCATTATCCT 
CCAGGAGAGGTTTCTCTATGCCTGTGGATTTGCTTCTCATCCCAATGAAGAACTCCTGAGAGCCCTCATT 
AGTAAGTTCAAAGGTTCTATTGGTAGCAGTGACATCAGAGAAACTGTTATGATCATCACTGGGACACTTG 
TCAGAAAGTTGTGTCAGAATGAAGGCTGCAAACTCAAAGCAGTAGTGGAAGCTAAGAAGTTAATCCTGGG 
AGGACTTGAAAAAGCAGAGAAAAAAGAGGACACCAGGATGTATCTGCTGGCTTTGAAGAATGCCCTGCTT 
CCAGAAGGCATCCCAAGTCTTCTGAAGTATGCAGAAGCAGGAGAAGGGCCCATCAGCCACCTGGCTACCA 
CTGCTCTCCAGAGATATGATGCTCCCTTTCATAACTGATGAGGTGAAGAAGACCTTAAACAGAATATACC 
ACCAAAACCGTAAAGTTCATGAAAAGACTGTGCGCACTGCTGCAGCTGCTATCATTTTAAATAACAATCC 
ATCCTACATGGACGTCAAGAACATCCTGCTGTCTATTGGGGAGCTTCCCCAAGAAATGAATAAATACATG 
CTCGCCATTGTTCAAGACATCCTACGTTTTGAAATGCCTGCAAGCAAAATTGTCCGTCGAGTTCTGAAGG 
AAATGGTCGCTCACAATTATGACCGTTTCTCCAGGAGTGGATCTTCTTCTGCCTACACTGGCTACATAGA 
ACGTAGTCCCCGTTCGGCATCTACTTACAGCCTAGACATTCTCTACTCGGGTTCTGGCATTCTAAGGAGA 
AGTAACCTGAACATCTTTCAGTACATTGGGAAGGCTGGTCTTCACGGTAGCCAGGTGGTTATTGAAGCCC 
AAGGACTGGAAGCCTTAATCGCAGCCACCCCTGACGAGGGGGAGGAGAACCTTGACTCCTATGCTGGTAT 
GTCAGCCATCCTCTTTGATGTTCAGCTCAGACCTGTCACCTTTTTCAACGGATACAGTGATTTGATGTCC 
AAAATGCTGTCAGCATCTGGCGACCCTATCAGTGTGGTGAAAGGACTTATTCTGCTAATAGATCATTCTC 
AGGAACTTCAGTTACAATCTGGACTAAAAGCCAATATAGAGGTCCAGGGTGGTCTAGCTATTGATATTTC 
AGGTGCAATGGAGTTTAGCTTGTGGTATCGTGAGTCTAAAACCCGAGTGAAAAATAGGGTGACTGTGGTA
ATAACCACTGACATCACAGTGGACTCCTCTTTTGTGAAAGCTGGCCTGGAAACCAGTACAGAAACAGAAG 
CAGGCTTGGAGTTTATCTCCACAGTGCAGTTTTCTCAGTACCCATTCTTAGTTTGCATGCAGATGGACAA 
GGATGAAGCTCCATTCAGGCAATTTGAGAAAAAGTACGAAAGGCTGTCCACAGGCAGAGGTTATGTCTCT 
CAGAAAAGAAAAGAAAGCGTATTAGCAGGATGTGAATTCCCGCTCCATCAAGAGAACTCAGAGATGTGCA
AAGTGGTGTTTGCCCCTCAGCCGGATAGTACTTCCAGCGGATGGTTTTGAAACTGACCTGTGATATTTTA 
CTTGAATTTGTCTCCCCGAAAGGGACACAATGTGGCATGACTAAGTACTTGCTCTCTGAGAGCACAGCGT 
TTACATATTTACCTGTATTTAAGATTTTTGTAAAAAGCTACAAAAAACTGCAGTTTGATCAAATTTGGGT 
ATATGCAGTATGCTACCCACAGCGTCATTTTGAATCATCATGTGACGCTTTCAACAACGTTCTTAGTTTA 

CTTATACCTCTCTCAAATCTCATTTGGTACAGTCAGAATAGTTATTCTCTAAGAGGAAACTAGTGTTTGT
TAAAAACAAAAATAAAAACAAAACCACACAAGGAGAACCCAATTTTGTTTCAACAATTTTTGATCAATGT 
ATATGAAGCTCTTGATAGGACTTCCTTAAGCATGACGGGAAAACCAAACACGTTCCCTAATCAGGAAAAA 
AAAAAAAAAAGAAAAAGTAAGACACAAACAAACCATTTTTTTCTCTTTTTTTGGAGTTGGGGGCCCAGGG 
AGAAGGGACAAGGCTTTTAAAAGACTTGTTAGCCAACTTCAAGAATTAATATTTATGTCTCTGTTATTGT 

TAGTTTTAAGCCTTAAGGTAGAAGGCACATAGAAATAACATCTCATCTTTCTGCTGACCATTTTAGTGAG
GTTGTTCCAAAGAGCATTCAGGTCTCTACCTCCAGCCCTGCAAAAATATTGGACCTAGCACAGAGGAATC 
AGGAAAATTAATTTCAGAAACTCCATTTGATTTTTCTTTTGCTGTGTCTTTTTTGAGACTGTAATATGGT 
ACACTGTCCTCTAAGGACATCCTCATTTTATCTCACCTTTTTGGGGGTGAGAGCTCTAGTTCATTTAACT 
GTACTCTGCACAATAGCTAGGATGACTAAGAGAACATTGCTTCAAGAAACTGGTGGATTTGGATTTCCAA 

AATATGAAATAAGGAGAAAAATGTTTTTATTTGTATGAATTAAAAGATCCATGTTGAACATTTGCAAATA
TTTATTAATAAACAGATGTGGTGATAAACCCAAAACAAATGACAGGTGCTTATTTTCCACTAAACACAGA 
CACATGAAATGAAAGTTTAGCTAGCCCACTATTTGTTGTAAATTGAAAACGAAGTGTGATAAAATAAATA 
TGTAGAAATCAAAAAAAAAAAAAAAAAAAA (SEQ ID NO: 6684) 




>gi|21361125|ref|NM_001467.2| Homo sapiens glucose-6-phosphatase,
transport (glucose-6-phosphate) protein 1 (G6PT1), mRNA

GGCACGAGGGGCCACCGAGGCGCTGTCCCTGACCACCAGCACGAGACCCCTTTCTATCGCGCCAGTCCTG 
TGGTCTCCGCACCTCTCCAGCTCCTGCACCCCCGGCCCCCGTGGTTCCCAGCCGCACAGTAGCGTGTCCT 

GGGTAGCGTGAGGACCCACGGGGCTGAGCAGGTGCCACGAGCCCGCCGCCTCTTCGCCGCCCGCCGCCTC
TCCTCCTCTCCCGCCCGCCGCCTGGCCCTCCCCTACCAGGCTGAGCCTCTGGCTGCCAGAAGCGCGGGGC 
CTCCGGGAGAATACGTGCGGTCGCCCGCTCCGCGTGCGCCTACGCCTTCTGCTCCAGTTGCTTTCCCAAT 
TGAGCGGAAAAGCCGGGGCATGTTGCCGGGGCCCTGGGCGGGACGGTTGTGCCCTGCAGCCCGAAGCCCG 
CCGGGGCACCTTCCCGCCCACGAGCTGCCCAGTCCCTCTGCTTGCGGCCCCTGCCAACGTCCCACAGGAC 

ACTGGGTCCCCTTGGAGCCTCCCCAGGCTTAATGATTGTCCAGAAGGCGGCTATAAAGGGAGCCTGGGAG
GCTGGGTGGAGGAGGGAGCAGAAAAAACCCAACTCAGCAGATCTGGGAACTGTGAGAGCGGCAAGCAGGA 
ACTGTGGTCAGAGGCTGTGCGTCTTGGCTGGTAGGGCCTGCTCTTTTCTACCATGGCAGCCCAGGGCTAT 
GGCTATTATCGCACTGTGATCTTCTCAGCCATGTTTGGGGGCTACAGCCTGTATTACTTCAATCGCAAGA 
CCTTCTCCTTTGTCATGCCATCATTGGTGGAAGAGATCCCTTTGGACAAGGATGATTTGGGGTTCATCAC 

CAGCAGCCAGTCGGCAGCTTATGCTATCAGCAAGTTTGTCAGTGGGGTGCTGTCTGACCAGATGAGTGCT
CGCTGGCTCTTCTCTTCTGGGCTGCTCCTGGTTGGCCTGGTCAACATATTCTTTGCCTGGAGCTCCACAG 
TACCTGTCTTTGCTGCCCTCTGGTTCCTTAATGGCCTGGCCCAGGGGCTGGGCTGGCCCCCATGTGGGAA 
GGTCCTGCGGAAGTGGTTTGAGCCATCTCAGTTTGGCACTTGGTGGGCCATCCTGTCAACCAGCATGAAC 
CTGGCTGGAGGGCTGGGCCCTATCCTGGCAACCATCCTTGCCCAGAGCTACAGCTGGCGCAGCACGCTGG 

CCCTATCTGGGGCACTGTGTGTGGTTGTCTCCTTCCTCTGTCTCCTGCTCATCCACAATGAACCTGCTGA
TGTTGGACTCCGCAACCTGGACCCCATGCCCTCTGAGGGCAAGAAGGGCTCCTTGAAGGAGGAGAGCACC 
CTGCAGGAGCTGCTGCTGTCCCCTTACCTGTGGGTGCTCTCCACTGGTTACCTTGTGGTGTTTGGAGTAA 
AGACCTGCTGTACTGACTGGGGCCAGTTCTTCCTTATCCAGGAGAAAGGACAGTCAGCCCTTGTAGGTAG 
CTCCTACATGAGTGCCCTGGAAGTTGGGGGCCTTGTAGGCAGCATCGCAGCTGGCTACCTGTCAGACCGG 

GCCATGGCAAAGGCGGGACTGTCCAACTACGGGAACCCTCGCCATGGCCTGTTGCTGTTCATGATGGCTG
GCATGACAGTGTCCATGTACCTCTTCCGGGTAACAGTGACCAGTGACTCCCCCAAGCTCTGGATCCTGGT 
ATTGGGAGCTGTATTTGGTTTCTCCTCGTATGGCCCCATTGCCCTGTTTGGAGTCATAGCCAACGAGAGT 
GCCCCTCCCAACTTGTGTGGCACCTCCCACGCCATTGTGGGACTCATGGCCAATGTGGGCGGCTTTCTGG 
CTGGGCTGCCCTTCAGCACCATTGCCAAGCACTACAGTTGGAGCACAGCCTTCTGGGTGGCTGAAGTGAT 

TTGTGCGGCCAGCACGGCTGCCTTCTTCCTCCTACGAAACATCCGCACCAAGATGGGCCGAGTGTCCAAG
AAGGCTGAGTGAAGAGAGTCCAGGTTCCGGAGCACCATCCCACGGTGGCCTTCCCCCTGCACGCTCTGCG
GGGAGAAAAGGAGGGGCCTGCCTGGCTAGCCCTGAACCTTTCACTTTCCATTTCTGCGCCTTTTCTGTCA 
CCCGGGTGGCGCTGGAAGTTATCAGTGGCTAGTGAGGTCCCAGCTCCCTGATCCTATGCTCTATTTAAAA 
GATAACCTTTGGCCTTAGACTCCGTTAGCTCCTATTTCCTGCCTTCAGACAAACAGGAAACTTCTGCAGT 
CAGGAAGGCTCCTGTACCCTTCTTCTTTTCCTAGGCCCTGTCCTGCCCGCATCCTACCCCATCCCCACCT
GAAGTGAGGCTATCCCTGCAGCTGCAGGGCACTAATGACCCTTGACTTCTGCTGGGTCCTAAGTCCTCTC 
AGCAGTGGGTGACTGCTGTTGCCAATACCTCAGACTCCAGGGAAAGAGAGGAGGCCATCATTCTCACTGT 
ACCACTAGGCGCAGTTGGATATAGGTGGGAAGAAAAGGTGACTTGTTATAGAAGATTAAAACTAGATTTG 
ATACTGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA (SEQ ID NO: 6685) 




>gi|4503130|ref|NM_001904.1| Homo sapiens catenin (cadherin-associated
protein), beta 1, 88kDa (CTNNB1), mRNA

AAGCCTCTCGGTCTGTGGCAGCAGCGTTGGCCCGGCCCCGGGAGCGGAGAGCGAGGGGAGGCGGAGACGG 

AGGAAGGTCTGAGGAGCAGCTTCAGTCCCCGCCGAGCCGCCACCGCAGGTCGAGGACGGTCGGACTCCCG
CGGCGGGAGGAGCCTGTTCCCCTGAGGGTATTTGAAGTATACCATACAACTGTTTTGAAAATCCAGCGTG 
GACAATGGCTACTCAAGCTGATTTGATGGAGTTGGACATGGCCATGGAACCAGACAGAAAAGCGGCTGTT 
AGTCACTGGCAGCAACAGTCTTACCTGGACTCTGGAATCCATTCTGGTGCCACTACCACAGCTCCTTCTC 
TGAGTGGTAAAGGCAATCCTGAGGAAGAGGATGTGGATACCTCCCAAGTCCTGTATGAGTGGGAACAGGG 

ATTTTCTCAGTCCTTCACTCAAGAACAAGTAGCTGATATTGATGGACAGTATGCAATGACTCGAGCTCAG
AGGGTACGAGCTGCTATGTTCCCTGAGACATTAGATGAGGGCATGCAGATCCCATCTACACAGTTTGATG 
CTGCTCATCCCACTAATGTCCAGCGTTTGGCTGAACCATCACAGATGCTGAAACATGCAGTTGTAAACTT 
GATTAACTATCAAGATGATGCAGAACTTGCCACACGTGCAATCCCTGAACTGACAAAACTGCTAAATGAC 
GAGGACCAGGTGGTGGTTAATAAGGCTGCAGTTATGGTCCATCAGCTTTCTAAAAAGGAAGCTTCCAGAC 

ACGCTATCATGCGTTCTCCTCAGATGGTGTCTGCTATTGTACGTACCATGCAGAATACAAATGATGTAGA
AACAGCTCGTTGTACCGCTGGGACCTTGCATAACCTTTCCCATCATCGTGAGGGCTTACTGGCCATCTTT 
AAGTCTGGAGGCATTCCTGCCCTGGTGAAAATGCTTGGTTCACCAGTGGATTCTGTGTTGTTTTATGCCA 
TTACAACTCTCCACAACCTTTTATTACATCAAGAAGGAGCTAAAATGGCAGTGCGTTTAGCTGGTGGGCT 
GCAGAAAATGGTTGCCTTGCTCAACAAAACAAATGTTAAATTCTTGGCTATTACGACAGACTGCCTTCAA 

ATTTTAGCTTATGGCAACCAAGAAAGCAAGCTCATCATACTGGCTAGTGGTGGACCCCAAGCTTTAGTAA
ATATAATGAGGACCTATACTTACGAAAAACTACTGTGGACCACAAGCAGAGTGCTGAAGGTGCTATCTGT 
CTGCTCTAGTAATAAGCCGGCTATTGTAGAAGCTGGTGGAATGCAAGCTTTAGGACTTCACCTGACAGAT 
CCAAGTCAACGTCTTGTTCAGAACTGTCTTTGGACTCTCAGGAATCTTTCAGATGCTGCAACTAAACAGG 
AAGGGATGGAAGGTCTCCTTGGGACTCTTGTTCAGCTTCTGGGTTCAGATGATATAAATGTGGTCACCTG 

TGCAGCTGGAATTCTTTCTAACCTCACTTGCAATAATTATAAGAACAAGATGATGGTCTGCCAAGTGGGT
GGTATAGAGGCTCTTGTGCGTACTGTCCTTCGGGCTGGTGACAGGGAAGACATCACTGAGCCTGCCATCT 
GTGCTCTTCGTCATCTGACCAGCCGACACCAAGAAGCAGAGATGGCCCAGAATGCAGTTCGCCTTCACTA 
TGGACTACCAGTTGTGGTTAAGCTCTTACACCCACCATCCCACTGGCCTCTGATAAAGGCTACTGTTGGA 
TTGATTCGAAATCTTGCCCTTTGTCCCGCAAATCATGCACCTTTGCGTGAGCAGGGTGCCATTCCACGAC

TAGTTCAGTTGCTTGTTCGTGCACATCAGGATACCCAGCGCCGTACGTCCATGGGTGGGACACAGCAGCA
ATTTGTGGAGGGGGTCCGCATGGAAGAAATAGTTGAAGGTTGTACCGGAGCCCTTCACATCCTAGCTCGG 
GATGTTCACAACCGAATTGTTATCAGAGGACTAAATACCATTCCATTGTTTGTGCAGCTGCTTTATTCTC 
CCATTGAAAACATCCAAAGAGTAGCTGCAGGGGTCCTCTGTGAACTTGCTCAGGACAAGGAAGCTGCAGA 
AGCTATTGAAGCTGAGGGAGCCACAGCTCCTCTGACAGAGTTACTTCACTCTAGGAATGAAGGTGTGGCG 

ACATATGCAGCTGCTGTTTTGTTCCGAATGTCTGAGGACAAGCCACAAGATTACAAGAAACGGCTTTCAG
TTGAGCTGACCAGCTCTCTCTTCAGAACAGAGCCAATGGCTTGGAATGAGACTGCTGATCTTGGACTTGA 
TATTGGTGCCCAGGGAGAACCCCTTGGATATCGCCAGGATGATCCTAGCTATCGTTCTTTTCACTCTGGT 
GGATATGGCCAGGATGCCTTGGGTATGGACCCCATGATGGAACATGAGATGGGTGGCCACCACCCTGGTG 
CTGACTATCCAGTTGATGGGCTGCCAGATCTGGGGCATGCCCAGGACCTCATGGATGGGCTGCCTCCAGG 

TGACAGCAATCAGCTGGCCTGGTTTGATACTGACCTGTAAATCATCCTTTAGCTGTATTGTCTGAACTTG
CATTGTGATTGGCCTGTAGAGTTGCTGAGAGGGCTCGAGGGGTGGGCTGGTATCTCAGAAAGTGCCTGAC 
ACACTAACCAAGCTGAGTTTCCTATGGGAACAATTGAAGTAAACTTTTTGTTCTGGTCCTTTTTGGTCGA 
GGAGTAACAATACAAATGGATTTTGGGAGTGACTCAAGAAGTGAAGAATGCACAAGAATGGATCACAAGA 

GTACTGACTTTGCTTGCTTTGAAGTAGCTCTTTTTTTTTTTTTTTTTTTTTTTTTTTGCAGTAACTGTTT
TTTAAGTCTCTCGTAGTGTTAAGTTATAGTGAATACTGCTACAGCAATTTCTAATTTTTAAGAATTGAGT
AATGGTGTAGAACACTAATTAATTCATAATCACTCTAATTAATTGTAATCTGAATAAAGTGTAACAATTG 
TGTAGCCTTTTTGTATAAAATAGACAAATAGAAAATGGTCCAATTAGTTTCCTTTTTAATATGCTTAAAA 
TAAGCAGGTGGATCTATTTCATGTTTTTGATCAAAAACTATTTGGGATATGTATGGGTAGGGTAAATCAG 
TAAGAGGTGTTATTTGGAACCTTGTTTTGGACAGTTTACCAGTTGCCTTTTATCCCAAAGTTGTTGTAAC
CTGCTGTGATACGATGCTTCAAGAGAAAATGCGGTTATAAAAAATGGTTCAGAATTAAACTTTTAATTCA 
TT (SEQ ID NO: 6686) 



>gi|18104977|ref|NM_002827.2| Homo sapiens protein tyrosine phosphatase,
non-receptor type 1 (PTPN1), mRNA 

GTGATGCGTAGTTCCGGCTGCCGGTTGACATGAAGAAGCAGCAGCGGCTAGGGCGGCGGTAGCTGCAGGG 
GTCGGGGATTGCAGCGGGCCTGGGGGCTAAGAGCGCGACGCGGCCTAGAGCGGCAGACGGCGCAGTGGGC 
CGAGAAGGAGGCGCAGCAGCCGCCCTGGCCCGTCATGGAGATGGAAAAGGAGTTCGAGCAGATCGACAAG 

TCCGGGAGCTGGGCGGCCATTTACCAGGATATCCGACATGAAGCCAGTGACTTCCCATGTAGAGTGGCCA
AGCTTCCTAAGAACAAAAACCGAAATAGGTACAGAGACGTCAGTCCCTTTGACCATAGTCGGATTAAACT 
ACATCAAGAAGATAATGACTATATCAACGCTAGTTTGATAAAAATGGAAGAAGCCCAAAGGAGTTACATT 
CTTACCCAGGGCCCTTTGCCTAACACATGCGGTCACTTTTGGGAGATGGTGTGGGAGCAGAAAAGCAGGG 
GTGTCGTCATGCTCAACAGAGTGATGGAGAAAGGTTCGTTAAAATGCGCACAATACTGGCCACAAAAAGA 

AGAAAAAGAGATGATCTTTGAAGACACAAATTTGAAATTAACATTGATCTCTGAAGATATCAAGTCATAT
TATACAGTGCGACAGCTAGAATTGGAAAACCTTACAACCCAAGAAACTCGAGAGATCTTACATTTCCACT 
ATACGACATGGCCTGACTTTGGAGTCCCTGAATCACCAGCCTCATTCTTGAACTTTCTTTTCAAAGTCCG 
AGAGTCAGGGTCACTCAGCCCGGAGCACGGGCCCGTTGTGGTGCACTGCAGTGCAGGCATCGGCAGGTCT 
GGAACCTTCTGTCTGGCTGATACCTGCCTCTTGCTGATGGACAAGAGGAAAGACCCTTCTTCCGTTGATA 

TCAAGAAAGTGCTGTTAGAAATGAGGAAGTTTCGGATGGGGCTGATCCAGACAGCCGACCAGCTGCGCTT
CTCCTACCTGGCTGTGATCGAAGGTGCCAAATTCATCATGGGGGACTCTTCCGTGCAGGATCAGTGGAAG 
GAGCTTTCCCACGAGGACCTGGAGCCCCCACCCGAGCATATCCCCCCACCTCCCCGGCCACCCAAACGAA 
TCCTGGAGCCACACAATGGGAAATGCAGGGAGTTCTTCCCAAATCACCAGTGGGTGAAGGAAGAGACCCA 
GGAGGATAAAGACTGCCCCATCAAGGAAGAAAAAGGAAGCCCCTTAAATGCCGCACCCTACGGCATCGAA 

AGCATGAGTCAAGACACTGAAGTTAGAAGTCGGGTCGTGGGGGGAAGTCTTCGAGGTGCCCAGGCTGCCT
CCCCAGCCAAAGGGGAGCCGTCACTGCCCGAGAAGGACGAGGACCATGCACTGAGTTACTGGAAGCCCTT 
CCTGGTCAACATGTGCGTGGCTACGGTCCTCACGGCCGGCGCTTACCTCTGCTACAGGTTCCTGTTCAAC 
AGCAACACATAGCCTGACCCTCCTCCACTCCACCTCCACCCACTGTCCGCCTCTGCCCGCAGAGCCCACG 
CCCGACTAGCAGGCATGCCGCGGTAGGTAAGGGCCGCCGGACCGCGTAGAGAGCCGGGCCCCGGACGGAC 

GTTGGTTCTGCACTAAAACCCATCTTCCCCGGATGTGTGTCTCACCCCTCATCCTTTTACTTTTTGCCCC
TTCCACTTTGAGTACCAAATCCACAAGCCATTTTTTGAGGAGAGTGAAAGAGAGTACCATGCTGGCGGCG 
CAGAGGGAAGGGGCCTACACCCGTCTTGGGGCTCGCCCCACCCAGGGCTCCCTCCTGGAGCATCCCAGGC 
GGGCGGCACGCCAACAGCCCCCCCCTTGAATCTGCAGGGAGCAACTCTCCACTCCATATTTATTTAAACA 
ATTTTTTCCCCAAAGGCATCCATAGTGCACTAGCATTTTCTTGAACCAATAATGTATTAAAATTTTTTGA 

TGTCAGCCTTGCATCAAGGGCTTTATCAAAAAGTACAATAATAAATCCTCAGGTAGTACTGGGAATGGAA
GGCTTTGCCATGGGCCTGCTGCGTCAGACCAGTACTGGGAAGGAGGACGGTTGTAAGCAGTTGTTATTTA 
GTGATATTGTGGGTAACGTGAGAAGATAGAACAATGCTATAATATATAATGAACACGTGGGTATTTAATA 
AGAAACATGATGTGAGATTACTTTGTCCCGCTTATTCTCCTCCCTGTTATCTGCTAGATCTAGTTCTCAA 
TCACTGCTCCCCCGTGTGTATTAGAATGCATGTAAGGTCTTCTTGTGTCCTGATGAAAAATATGTGCTTG 

AAATGAGAAACTTTGATCTCTGCTTACTAATGTGCCCCATGTCCAAGTCCAACCTGCCTGTGCATGACCT
GATCATTACATGGCTGTGGTTCCTAAGCCTGTTGCTGAAGTCATTGTCGCTCAGCAATAGGGTGCAGTTT 
TCCAGGAATAGGCATTTGCCTAATTCCTGGCATGACACTCTAGTGACTTCCTGGTGAGGCCCAGCCTGTC 
CTGGTACAGCAGGGTCTTGCTGTAACTCAGACATTCCAAGGGTATGGGAAGCCATATTCACACCTCACGC 
TCTGGACATGATTTAGGGAAGCAGGGACACCCCCCGCCCCCCACCTTTGGGATCAGCCTCCGCCATTCCA 

AGTCAACACTCTTCTTGAGCAGACCGTGATTTGGAAGAGAGGCACCTGCTGGAAACCACACTTCTTGAAA
CAGCCTGGGTGACGGTCCTTTAGGCAGCCTGCCGCCGTCTCTGTCCCGGTTCACCTTGCCGAGAGAGGCG 
CGTCTGCCCCACCCTCAAACCCTGTGGGGCCTGATGGTGCTCACGACTCTTCCTGCAAAGGGAACTGAAG 
ACCTCCACATTAAGTGGCTTTTTAACATGAAAAACACGGCAGCTGTAGCTCCCGAGCTACTCTCTTGCCA 
GCATTTTCACATTTTGCCTTTCTCGTGGTAGAAGCCAGTACAGAGAAATTGTGTGGTGGGAACATTCGAG 

GTGTCACCCTGCAGAGCTATGGTGAGGTGTGGATAAGGCTTAGGTGCCAGGCTGTAAGCATTCTGAGCTG
GGCTTGTTGTTTTTAAGTCCTGTATATGTATGTAGTAGTTTGGGTGTGTATATATAGTAGCATTTCAAAA
TGGACGTACTGGTTTAACCTCCTATCCTTGGAGAGCAGCTGGCTCTCCACCTTGTTACACATTATGTTAG 
AGAGGTAGCGAGCTGCTCTGCTATATGCCTTAAGCCAATATTTACTCATCAGGTCATTATTTTTTACAAT 
GGCCATGGAATAAACCATTTTTACAAAA (SEQ ID NO: 6687) 




>gi|12831192|gb|AF333324.1| Hepatitis C virus type 1b polyprotein mRNA,
complete cds

GCCAGCCCCCGATTGGGGGCGACACTCCACCATAGATCACTCCCCTGTGAGGAACTACTGTCTTCACGCA 

GAAAGCGTCTAGCCATGGCGTTAGTATGAGTGTCGTGCAGCCTCCAGGACCCCCCCTCCCGGGAGAGCCA
TAGTGGTCTGCGGAACCGGTGAGTACACCGGAATTGCCAGGACGACCGGGTCCTTTCTTGGATCAACCCG 
CTCAATGCCTGGAGATTTGGGCGTGCCCCCGCGAGACTGCTAGCCGAGTAGTGTTGGGTCGCGAAAGGCC 
TTGTGGTACTGCCTGATAGGGTGCTTGCGAGTGCCCCGGGAGGTCTCGTAGACCGTGCATCATGAGCACA 
AATCCTAAACCTCAAAGAAAAACCAAACGTAACACCAACCGCCGCCCACAGGACGTTAAGTTCCCGGGCG 

GTGGTCAGATCGTTGGTGGAGTTTACCTGTTGCCGCGCAGGGGCCCCAGGTTGGGTGTGCGCGCGACTAG
GAAGACTTCCGAGCGGTCGCAACCTCGTGGAAGGCGACAACCTATCCCCAAGGCTCGCCGGCCCGAGGGT 
AGGACCTGGGCTCAGCCCGGGTACCCTTGGCCCCTCTATGGCAACGAGGGTATGGGGTGGGCAGGATGGC 
TCCTGTCACCCCGTGGCTCTCGGCCTAGTTGGGGCCCCACAGACCCCCGGCGTAGGTCGCGTAATTTGGG 
TAAGGTCATCGATACCCTTACATGCGGCTTCGCCGACCTCATGGGGTACATTCCGCTTGTCGGCGCCCCC 

CTAGGAGGCGCTGCCAGGGCCCTGGCGCATGGCGTCCGGGTTCTGGAGGACGGCGTGAACTATGCAACAG
GGAATCTGCCCGGTTGCTCTTTCTCTATCTTCCTCTTAGCTTTGCTGTCTTGTTTGACCATCCCAGCTTC 
CGCTTACGAGGTGCGCAACGTGTCCGGGATATACCATGTCACGAACGACTGCTCCAACTCAAGTATTGTG 
TATGAGGCAGCGGACATGATCATGCACACCCCCGGGTGCGTGCCCTGCGTCCGGGAGAGTAATTTCTCCC 
GTTGCTGGGTAGCGCTCACTCCCACGCTCGCGGCCAGGAACAGCAGCATCCCCACCACGACAATACGACG 

CCACGTCGATTTGCTCGTTGGGGCGGCTGCTCTCTGTTCCGCTATGTACGTTGGGGATCTCTGCGGATCC
GTTTTTCTCGTCTCCCAGCTGTTCACCTTCTCACCTCGCCGGTATGAGACGGTACAAGATTGCAATTGCT 
CAATCTATCCCGGCCACGTATCAGGTCACCGCATGGCTTGGGATATGATGATGAACTGGTCACCTACAAC 
GGCCCTAGTGGTATCGCAGCTACTCCGGATCCCACAAGCCGTCGTGGACATGGTGGCGGGGGCCCACTGG 
GGTGTCCTAGCGGGCCTTGCCTACTATTCCATGGTGGGGAACTGGGCTAAGGTCTTGATTGTGATGCTAC

TCTTTGCTGGCGTTGACGGGCACACCCACGTGACAGGGGGAAGGGTAGCCTCCAGCACCCAGAGCCTCGT
GTCCTGGCTCTCACAAGGGCCATCTCAGAAAATCCAACTCGTGAACACCAACGGCAGCTGGCACATCAAC 
AGGACCGCTCTGAATTGCAATGACTCCCTCCAAACTGGGTTCATTGCTGCGCTGTTCTACGCACACAGGT 
TCAACGCGTCCGGATGTCCAGAGCGCATGGCCAGCTGCCGCCCCATCGACAAGTTCGCTCAGGGGTGGGG 
TCCCATCACTCACGTTGTGCCTAACATCTCGGACCAGAGGCCTTATTGCTGGCACTATGCACCCCAACCG 

TGCGGTATTGTACCCGCGTCGCAGGTGTGTGGCCCAGTGTATTGCTTCACCCCGAGTCCTGTTGTGGTGG
GGACGACCGACCGTTCCGGAGTCCCCACGTATAGCTGGGGGGAGAATGAGACAGACGTGCTGCTACTCAA 
CAACACGCGGCCGCCGCAAGGCAACTGGTTCGGCTGTACATGGATGAATAGCACCGGGTTCACCAAGACG 
TGCGGGGGCCCCCCGTGTAACATCGGGGGGGTTGGCAACAACACCTTGATTTGCCCCACGGATTGCTTCC 
GAAAGCACCCCGAGGCCACTTACACCAAATGCGGCTCGGGTCCTTGGTTGACACCTAGGTGTCTAGTTGA 

CTACCCATACAGACTTTGGCACTACCCCTGCACTATCAATTTTACCATCTTCAAGGTCAGGATGTACGTG
GGGGGCGTGGAGCACAGGCTCAACGCCGCGTGCAATTGGACCCGAGGAGAGCGCTGTGACCTGGAGGACA 
GGGATAGATCAGAGCTTAGCCCGCTGCTATTGTCTACAACGGAGTGGCAGGTACTGCCCTGTTCCTTTAC 
CACCCTACCGGCTCTGTCCACTGGATTGATCCACCTCCATCAGAATATCGTGGACGTGGAATACCTGTAC 
GGTGTAGGGTCAGTGGTTGTCTCCGTCGTAATCAAATGGGAGTATGTTCTGCTGCTCTTCCTTCTCCTGG 

CGGACGCGCGCGTCTGTGCCTGCTTGTGGATGATGCTGCTGATAGCCCAGGCTGAGGCCACCTTAGAGAA
CCTGGTGGTCCTCAATGCGGCGTCTGTGGCCGGAGCGCATGGCCTTCTCTCCTTCCTCGTGTTCTTCTGC 
GCCGCCTGGTACATCAAAGGCAGGCTGGTCCCTGGGGCGGCATATGCTCTCTATGGCGTATGGCCGTTGC 
TCCTGCTCTTGCTGGCTTTACCACCACGAGCTTATGCCATGGACCGAGAGATGGCTGCATCGTGCGGAGG 
CGCGGTTTTTGTAGGTCTGGTACTCTTGACCTTGTCACCATACTATAAGGTGTTCCTCGCTAGGCTCATA 

TGGTGGTTACAATATTTTATCACCAGGGCCGAGGCGCACTTGCAAGTGTGGGTCCCCCCTCTTAATGTTC
GGGGAGGCCGCGATGCCATCATCCTCCTTACATGCGCGGTCCATCCAGAGCTAATCTTTGACATCACCAA 
ACTCCTGCTCGCCATACTCGGTCCGCTCATGGTGCTCCAAGCTGGCATAACCAGAGTGCCGTACTTCGTG 
CGCGCTCAAGGGCTCATTCATGCATGCATGTTAGTGCGGAAGGTCGCTGGGGGTCATTATGTCCAAATGG 
CCTTCATGAAGCTGGGCGCGCTGACAGGCACGTACATTTACAACCATCTTACCCCGCTACGGGATTGGGC 

CCACGCGGGCCTACGAGACCTTGCGGTGGCAGTGGAGCCCGTCGTCTTCTCCGACATGGAGACCAAGATC
ATCACCTGGGGAGCAGACACCGCGGCGTGTGGGGACATCATCTTGGGTCTGCCCGTCTCCGCCCGAAGGG

GAAAGGAGATACTCCTGGGCCCGGCCGATAGTCTTGAAGGGCGGGGGTGGCGACTCCTCGCGCCCATCAC 

GGCCTACTCCCAACAGACGCGGGGCCTACTTGGTTGCATCATCACTAGCCTTACAGGCCGGGACAAGAAC 

CAGGTCGAGGGAGAGGTTCAGGTGGTTTCCACCGCAACACAATCCTTCCTGGCGACCTGCGTCAACGGCG 

TGTGTTGGACCGTTTACCATGGTGCTGGCTCAAAGACCTTAGCCGGCCCAAAGGGGCCAATCACCCAGAT 

GTACACTAATGTGGACCAGGACCTCGTCGGCTGGCAGGCGCCCCCCGGGGCGCGTTCCTTGACACCATGC 

ACCTGTGGCAGCTCAGACCTTTACTTGGTCACGAGACATGCTGACGTCATTCCGGTGCGCCGGCGGGGCG 

ACAGTAGGGGGAGCCTGCTCTCCCCCAGGCCTGTCTCCTACTTGAAGGGCTCTTCGGGTGGTCCACTGCT 

CTGCCCTTCGGGGCACGCTGTGGGCATCTTCCGGGCTGCCGTATGCACCCGGGGGGTTGCGAAGGCGGTG 

GACTTTGTGCCCGTAGAGTCCATGGAAACTACTATGCGGTCTCCGGTCTTCACGGACAACTCATCCCCCC 

CGGCCGTACCGCAGTCATTTCAAGTGGCCCACCTACACGCTCCCACTGGCAGCGGCAAGAGTACTAAAGT 

GCCGGCTGCATATGCAGCCCAAGGGTACAAGGTGCTCGTCCTCAATCCGTCCGTTGCCGCTACCTTAGGG 

TTTGGGGCGTATATGTCTAAGGCACACGGTATTGACCCCAACATCAGAACTGGGGTAAGGACCATTACCA 

CAGGCGCCCCCGTCACATACTCTACCTATGGCAAGTTTCTTGCCGATGGTGGTTGCTCTGGGGGCGCTTA 

TGACATCATAATATGTGATGAGTGCCATTCAACTGACTCGACTACAATCTTGGGCATCGGCACAGTCCTG 

GACCAAGCGGAGACGGCTGGAGCGCGGCTTGTCGTGCTCGCCACCGCTACGCCTCCGGGATCGGTCACCG 

TGCCACACCCAAACATCGAGGAGGTGGCCCTGTCTAATACTGGAGAGATCCCCTTCTATGGCAAAGCCAT 

CCCCATTGAAGCCATCAGGGGGGGAAGGCATCTCATTTTCTGTCATTCCAAGAAGAAGTGCGACGAGCTC 

GCCGCAAAGCTGTCAGGCCTCGGAATCAACGCTGTGGCGTATTACCGGGGGCTCGATGTGTCCGTCATAC 

CAACTATCGGAGACGTCGTTGTCGTGGCAACAGACGCTCTGATGACGGGCTATACGGGCGACTTTGACTC 

AGTGATCGACTGTAACACATGTGTCACCCAGACAGTCGACTTCAGCTTGGATCCCACCTTCACCATTGAG 

ACGACGACCGTGCCTCAAGACGCAGTGTCGCGCTCGCAGCGGCGGGGTAGGACTGGCAGGGGTAGGAGAG 

GCATCTACAGGTTTGTGACTCCGGGAGAACGGCCCTCGGGCATGTTCGATTCCTCGGTCCTGTGTGAGTG 

CTATGACGCGGGCTGTGCTTGGTACGAGCTCACCCCCGCCGAGACCTCGGTTAGGTTGCGGGCCTACCTG 

AACACACCAGGGTTGCCCGTTTGCCAGGACCACCTGGAGTTCTGGGAGAGTGTCTTCACAGGCCTCACCC 

ACATAGATGCACACTTCTTGTCCCAGACCAAGCAGGCAGGAGACAACTTCCCCTACCTGGTAGCATACCA 

AGCCACGGTGTGCGCCAGGGCTCAGGCCCCACCTCCATCATGGGATCAAATGTGGAAGTGTCTCATACGG 

CTGAAACCTACGCTGCACGGGCCAACACCCTTGCTGTACAGGCTGGGAGCCGTCCAAAATGAGGTCACCC 

TCACCCACCCCATAACCAAATACATCATGGCATGCATGTCGGCTGACCTGGAGGTCGTCACTAGCACCTG 

GGTGCTGGTGGGCGGAGTCCTTGCAGCTCTGGCCGCGTATTGCCTGACAACAGGCAGTGTGGTCATTGTG 

GGTAGGATTATCTTGTCCGGGAGGCCGGCTATTGTTCCCGACAGGGAGCTTCTCTACCAGGAGTTCGATG 

AAATGGAAGAGTGCGCCACGCACCTCCCTTACATTGAGCAGGGAATGCAGCTCGCCGAGCAGTTCAAGCA 

GAAAGCGCTCGGGTTACTGCAAACAGCCACCAAACAAGCGGAGGCTGCTGCTCCCGTGGTGGAGTCCAAG 

TGGCGAGCCCTTGAGACATTCTGGGCGAAGCACATGTGGAATTTCATCAGCGGGATACAGTACTTAGCAG 

GCTTATCCACTCTGCCTGGGAACCCCGCAATAGCATCATTGATGGCATTCACAGCCTCTATCACCAGCCC 

GCTCACCACCCAAAGTACCCTCCTGTTTAACATCTTGGGGGGGTGGGTGGCTGCCCAACTCGCCCCCCCC 

AGCGCCGCTTCGGCTTTCGTGGGCGCCGGCATCGCCGGTGCGGCTGTTGGCAGCATAGGCCTTGGGAAGG 

TGCTTGTGGACATTCTGGCGGGTTATGGAGCAGGAGTGGCCGGCGCGCTCGTGGCCTTTAAGGTCATGAG 

CGGCGAGATGCCCTCTACCGAGGACCTGGTCAATCTACTTCCTGCCATCCTCTCTCCTGGCGCCCTGGTC 

GTCGGGGTCGTGTGTGCAGCAATACTGCGTCGGCACGTGGGTCCGGGAGAGGGGGCTGTGCAGTGGATGA 

ACCGGCTGATAGCGTTCGCCTCGCGGGGTAATCACGTTTCCCCCACGCACTATGTGCCTGAGAGCGACGC 

CGCAGCGCGTGTTACTCAGATCCTCTCCAGCCTTACCATCACTCAGCTGCTGAAAAGGCTCCACCAGTGG 

ATTAATGAGGACTGCTCCACACCGTGTTCCGGCTCGTGGCTAAGGGATGTTTGGGACTGGATATGCACGG 

TGTTGACTGACTTCAAGACCTGGCTCCAGTCCAAGCTCCTGCCGCAGCTACCGGGAGTCCCTTTTTTCTC 

GTGCCAACGCGGGTACAAGGGAGTCTGGCGGGGAGACGGCATCATGCAAACCACCTGCCCATGTGGAGCA 

CAGATCACCGGACATGTCAAAAACGGTTCCATGAGGATCGTCGGGCCTAAGACCTGCAGCAACACGTGGC 

ATGGAACATTCCCCATCAACGCATACACCACGGGCCCCTGCACACCCTCTCCAGCGCCAAACTATTCTAG 

GGCGCTGTGGCGGGTGGCCGCTGAGGAGTACGTGGAGGTCACGCGGGTGGGGGATTTCCACTACGTGACG 

GGCATGACCACTGACAACGTAAAGTGCCCATGCCAGGTTCCGGCTCCTGAATTCTTCTCGGAGGTGGACG 

GAGTGCGGTTGCACAGGTACGCTCCGGCGTGCAGGCCTCTCCTACGGGAGGAGGTTACATTCCAGGTCGG 

GCTCAACCAATACCTGGTTGGGTCACAGCTACCATGCGAGCCCGAACCGGATGTAGCAGTGCTCACTTCC 

ATGCTCACCGACCCCTCCCACATCACAGCAGAAACGGCTAAGCGTAGGTTGGCCAGGGGGTCTCCCCCCT 

CCTTGGCCAGCTCTTCAGCTAGCCAGTTGTCTGCGCCTTCCTTGAAGGCGACATGCACTACCCACCATGT 

CTCTCCGGACGCTGACCTCATCGAGGCCAACCTCCTGTGGCGGCAGGAGATGGGCGGGAACATCACCCGC 

GTGGAGTCGGAGAACAAGGTGGTAGTCCTGGACTCTTTCGACCCGCTTCGAGCGGAGGAGGATGAGAGGG 

AAGTATCCGTTCCGGCGGAGATCCTGCGGAAATCCAAGAAGTTCCCCGCAGCGATGCCCATCTGGGCGCG 

CCCGGATTACAACCCTCCACTGTTAGAGTCCTGGAAGGACCCGGACTACGTCCCTCCGGTGGTGCACGGG 
TGCCCGTTGCCACCTATCAAGGCCCCTCCAATACCACCTCCACGGAGAAAGAGGACGGTTGTCCTAACAG

CGACAGCGGCACGGCGACCGCCCTTCCTGACCAGGCCTCCGACGACGGTGACAAAGGATCCGACGTTGAG 
TCGTACTCCTCCATGCCCCCCCTTGAGGGGGAACCGGGGGACCCCGATCTCAGTGACGGGTCTTGGTCTA 
CCGTGAGCGAGGAAGCTAGTGAGGATGTCGTCTGCTGCTCAATGTCCTACACATGGACAGGCGCCTTGAT
CACGCCATGCGCTGCGGAGGAAAGCAAGCTGCCCATCAACGCGTTGAGCAACTCTTTGCTGCGCCACCAT 
AACATGGTTTATGCCACAACATCTCGCAGCGCAGGCCTGCGGCAGAAGAAGGTCACCTTTGACAGACTGC 
AAGTCCTGGACGACCACTACCGGGACGTGCTCAAGGAGATGAAGGCGAAGGCGTCCACAGTTAAGGCTAA 
ACTCCTATCCGTAGAGGAAGCCTGCAAGCTGACGCCCCCACATTCGGCCAAATCCAAGTTTGGCTATGGG 

GCAAAGGACGTCCGGAACCTATCCAGCAAGGCCGTTAACCACATCCACTCCGTGTGGAAGGACTTGCTGG
AAGACACTGTGACACCAATTGACACCACCATCATGGCAAAAAATGAGGTTTTCTGTGTCCAACCAGAGAA 
AGGAGGCCGTAAGCCAGCCCGCCTTATCGTATTCCCAGATCTGGGAGTCCGTGTATGCGAGAAGATGGCC 
CTCTATGATGTGGTCTCCACCCTTCCTCAGGTCGTGATGGGCTCCTCATACGGATTCCAGTACTCTCCTG 
GGCAGCGAGTCGAGTTCCTGGTGAATACCTGGAAATCAAAGAAAAACCCCATGGGCTTTTCATATGACAC 

TCGCTGTTTCGACTCAACGGTCACCGAGAACGACATCCGTGTTGAGGAGTCAATTTACCAATGTTGTGAC
TTGGCCCCCGAAGCCAGACAGGCCATAAAATCGCTCACAGAGCGGCTTTATATCGGGGGTCCTCTGACTA 
ATTCAAAAGGGCAGAACTGCGGTTATCGCCGGTGCCGCGCGAGCGGCGTGCTGACGACTAGCTGCGGTAA 
CACCCTCACATGTTACTTGAAGGCCTCTGCAGCCTGTCGAGCTGCGAAGCTCCAGGACTGCACGATGCTC 
GTGAACGGAGACGACCTTGTCGTTATCTGTGAAAGCGCGGGAACCCAAGAGGACGCGGCGAGCCTACGAG 

TCTTCACGGAGGCTATGACTAGGTACTCTGCCCCCCCCGGGGACCCGCCCCAACCAGAATACGACTTGGA
GCTGATAACATCATGTTCCTCCAATGTGTCGGTCGCCCACGATGCATCAGGCAAAAGGGTGTACTACCTC 
ACCCGTGATCCCACCACCCCCCTCGCACGGGCTGCGTGGGAAACAGCTAGACACACTCCAGTTAACTCCT 
GGCTAGGCAACATTATCATGTATGCGCCCACTTTGTGGGCAAGGATGATTCTGATGACTCACTTCTTCTC 
CATCCTTCTAGCACAGGAGCAACTTGAAAAAGCCCTGGACTGCCAGATCTACGGGGCCTGTTACTCCATT 

GAGCCACTTGACCTACCTCAGATCATTGAACGACTCCATGGCCTTAGCGCATTTTCACTCCATAGTTACT
CTCCAGGTGAGATCAATAGGGTGGCTTCATGCCTCAGGAAACTTGGGGTACCACCCTTGCGAGTCTGGAG 
ACATCGGGCCAGGAGCGTCCGCGCTAGGCTACTGTCCCAGGGGGGGAGGGCCGCCACTTGTGGCAAGTAC 
CTCTTCAACTGGGCAGTGAAGACCAAACTCAAACTCACTCCAATCCCGGCTGCGTCCCAGCTGGACTTGT 
CCGGCTGGTTCGTTGCTGGTTACAGCGGGGGAGACATATATCACAGCCTGTCTCGTGCCCGACCCCGCTG 

GTTCATGCTGTGCCTACTCCTACTTTCTGTAGGGGTAGGCATCTACCTGCTCCCCAACCGATGAACGGGG
AGCTAAACACTCCAGGCCAATAGGCCATTTCCTGTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTCT 
TTTCCTTCTTTTTCCCTTTTTCTTTCTTCCTTCTTTAATGGTGGCTCCATCTTAGCCCTAGTCACGGCTA 
GCTGTGAAAGGTCCGTGAGCCGCATGACTGCAGAGAGTGCTGATACTGGCCTCTCTGCAGATCATGT 
(SEQ ID NO: 6688) 

gi|306286|gb|M96362.1|HPCUNKCDS Hepatitis C virus mRNA, complete cds 
TGCCAGCCCCCGATTGGGGGCGACACTCCACCATAGATCACTCCCCTGTGAGGAACTACTGTCTTCACGC 
AGAAAGCGTCTAGCCATGGCGTTAGTATGAGTGTCGTGCAGCCTCCAGGACCCCCCCTCCCGGGAGAGCC 
ATAGTGGTCTGCGGAACCGGTGAGTACACCGGAATTGCCAGGACGACCGGGTCCTTTCTTGGATCAACCC 

GCTCAATGCCTGGAGATTTGGGCGTGCCCCCGCGAGACTGCTAGCCGAGTAGTGTTGGGTCGCGAAAGGC
CTTGTGGTACTGCCTGATAGGGTGCTTGCGAGTGCCCCGGGAGGTCTCGTAGACCGTGCACCATGAGCAC 
GAATCCTAAACCTCAAAGAAAAACCAAACGTAACACCAACCGCCGCCCACAGGATATTAAGTTCCCGGGC 
GGTGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAGGGGCCCCAGGTTGGGTGTGCGCGCGACTA 
GGAAGACTTCCGAGCGGTCGCAACCTCGTGGAAGGCGACAGCCTATCCCCAAGGCTCGCCGGCCCGAGGG 

CAGGGCCTGGGCTCAGCCCGGGTACCCTTGGCCCCTCTATGGCAATGAGGGCTTGGGGTGGGCAGGATGG
CTCCTGTCACCCCGCGGCTCCCGGCCTAGTTGGGGCCCCACGGACCCCCGGCGTAAGTCGCGTAATTTGG 
GTAAGGTCATCGACACCCTCACATGCGGCTTCGCCGACCTCATGGGGTACATTCCGCTCGTCGGCGCCCC 
CCTAGGGGGCGTTGCCAGGGCCCTGGCACATGGTGTCCGGGTGCTGGAGGACGGCGTGAACTATGCAACA 
GGGAATCTGCCCGGTTGCTCTTTCTCTATCTTCCTCTTGGCTCTGCTGTCTTGTTTGACCACCCCAGTTT 

CCGCTTATGAAGTGCGTAACGCGTCCGGGATGTACCATGTCACGAACGACTGCTCCAACTCAAGCATTGT
GTATGAGGCAGCGGACATGATCATGCACACTCCCGGGTGCGTGCCCTGCGTTCGGGAGGACAACTCCTCC 
CGTTGCTGGGTGGCACTTACTCCCACGCTCGCGGCCAGGAATGCCAGCGTCCCCACTACGACATTGCGAC 
GCCATGTCGACTTGCTCGTTGGGGTAGCTGCTTTCTGTTCCGCTATGTACGTGGGGGACCTCTGCGGATC 
TGTTTTCCTTGTTTCCCAGCTGTTCACCTTTTCGCCTCGCCGGCATGAGACGGTACAGGACTGCAACTGC 

TCAATCTATCCCGGCCGCGTATCAGGTCACCGCATGGCCTGGGATATGATGATGAACTGGTCGCCTACAA
CAGCCCTAGTGGTATCGCAGCTACTCCGGATCCCACAAGCTGTCGTGGACATGGTGACAGGGTCCCACTG 
GGGAATCCTGGCGGGCCTTGCCTACTATTCCATGGTGGGGAACTGGGCTAAGGTCTTAATTGCGATGCTA
CTCTTTGCCGGCGTTGACGGAACCACCCACGTGACAGGGGGGGCGCAAGGTCGGGCCGCTAGCTCGCTAA 
CGTCCCTCTTTAGCCCTGGGCCGGTTCAGCACCTCCAGCTCATAAACACCAACGGCAGCTGGCATATCAA 
CAGGACCGCCCTGAGCTGCAATGACTCCCTCAACACTGGGTTTGTTGCCGCGCTGTTCTACAAATACAGG 
TTCAACGCGTCCGGGTGCCCGGAGCGCTTGGCCACGTGCCGCCCCATTGATACATTCGCGCAGGGGTGGG
GTCCCATCACTTACACTGAGCCTCATGATTTGGATCAGAGGCCCTATTGCTGGCACTACGCGCCTCAACC 
GTGTGGTATTGTGCCCACGTTGCAGGTGTGTGGCCCAGTATACTGCTTCACCCCGAGTCCTGTTGCGGTG 
GGGACTACCGATCGTTTCGGTGCCCCTACATACAGATGGGGGGCAAATGAGACGGACGTGCTGCTCCTTA 
ACAACGCCGGGCCGCCGCAAGGCAACTGGTTCGGCTGTACATGGATGAATGGCACTGGGTTCACCAAGAC 

ATGTGGGGGCCCCCCGTGTAACATCGGGGGGGTCGGCAACAATACCTTGACCTGCCCCACGGACTGCTTC
CGAAAGCACCCCGGGGCCACTTACACCAAATGCGGTTCGGGGCCTTGGTTAACACCCAGGTGCTTAGTCG 
ACTACCCGTACAGGCTCTGGCATTACCCCTGCACTGTCAACTTTACCATCTTTAAGGTTAGGATGTACGT 
GGGGGGCGCGGAGCACAGGCTCGACGCCGCATGCAACTGGACTCGGGGAGAGCGTTGTGACCTGGAGGAC 
AGGGATAGGTCAGAGCTTAGCCCGCTGCTGCTGTCTACAACAGAGTGGCAGGTACTGCCCTGTTCCTTCA 

CAACCCTACCGGCTCTGTCCACTGGTTTGATTCATCTCCATCAGAACATCGTGGACATACAATACCTGTA
CGGTATAGGGTCGGCGGTTGTCTCCTTTGCGATCAAATGGGAGTATATTGTGCTGCTCTTCCTTCTTCTG 
GCGGACGCGCGCGTCTGCGCTTGCTTGTGGATGATGCTGCTGGTAGCGCAAGCCGAGGCCGCCTTAGAGA 
ACCTGGTGGTCCTCAATGCAGCGTCCGTGGCCGGAGCGCATGGCATTCTTTCCTTCATTGTGTTCTTCTG 
TGCTGCCTGGTACATCAAGGGCAGGCTGGTTCCCGGAGCGGCATACGCCCTCTATGGCGTATGGCCGCTG 

CTTCTGCTTCTGCTGGCGTTACCACCACGGGCGTACGCCATGGACCGGGAGATGGCCGCATCGTGCGGAG
GCGCGGTTTTTGTAGGTCTGGTACTCTTGACCTTGTCACCACACTATAAAGTGTTCCTTGCCAGGTTCAT 
ATGGTGGCTACAATATCTCATCACCAGAACCGAAGCGCATCTGCAAGTGTGGGTCCCCCCTCTCAACGTT 
CGGGGGGGTCGCGATGCCATCATCCTCCTCACATGCGTGGTCCACCCAGAGCTAATCTTTGACATCACAA 
AATATTTGCTCGCCATATTCGGCCCGCTCATGGTGCTCCAGGCCGGCATAACTAGAGTGCCGTACTTCGT 

GCGCGCACAAGGGCTCATTCGTGCATGCATGTTGGCGCGGAAAGTCGTGGGGGGTCATTACGTCCAAATG
GTCTTCATGAAGCTGGCCGCACTAGCAGGTACGTACGTTTATGACCATCTTACTCCACTGCGAGATTGGG 
CTCACACGGGCTTACGAGACCTTGCAGTGGCAGTAGAGCCCGTTGTCTTCTCTGACATGGAGACCAAAGT 
CATCACCTGGGGGGCAGACACCGCGGCGTGCGGGGACATCATCTTGGCCCTGCCTGCTTCCGCCCGAAGG 
GGGAAGGAGATACTTCTGGGACCGGCCGATAGTCTTGAAGGACAGGGGTGGCGACTCCTTGCGCCCATCA 

CGGCCTACTCCCAACAAACGCGAGGCCTGCTTGGTTGCATCATCACTAGCCTTACAGGCCGGGACAAGAA
CCAGGTTGAGGGGGAGGTTCAAGTGGTTTCCACCGCAACACAATCTTTCCTGGCGACCTGCATCAATGGC 
GTGTGTTGGACTGTCTTCCACGGCGCCGGCTCAAAGACCCTAGCCGGCCCAAAGGGTCCAATCACCCAAA 
TGTACACCAATGTAGACCAGGACCTTGTTGGCTGGCCGGCACCTCCTGGGGCGCGTTCCCTGACACCATG 
CACTTGCGGCTCCTCGGACCTTTACCTGGTCACGAGACATGCTGATGTCATTCCGGTGCGCCGGCGGGGT 

GACGGTAGGGGGAGCCTACTCCCCCCCAGGCCTGTCTCCTACTTGAAGGGCTCCTCGGGTGGTCCACTGC
TCTGCCCTTCGGGGCACGCTGTCGGCATACTTCCGGCTGCTGTATGCACCCGGGGGGTTGCCATGGCGGT 
GGAATTCATACCCGTTGAGTCTATGGAAACTACTATGCGGTCTCCGGTCTTCACGGACAATCCGTCTCCC 
CCGGCTGTACCGCAGACATTCCAAGTGGCCCACTTACACGCTCCCACCGGCAGCGGCAAGAGCACTAGGG 
TGCCGGCTGCATATGCAGCCCAAGGGTACAAGGTGCTCGTCCTAAATCCGTCCGTCGCCGCCACCTTGGG 

TTTTGGGGCGTATATGTCCAAGGCACATGGTATCGACCCCAACCTTAGAACTGGGGTAAGGACCATCACC
ACAGGTGCCCCTATCACATACTCCACCTATGGCAAGTTCCTTGCCGACGGTGGCGGCTCCGGGGGCGCCT 
ATGACATCATAATGTGTGATGAGTGCCACTCAACTGACTCGACTACCATTTATGGCATCGGCACAGTCCT 
GGACCAAGCGGAGACGGCTGGAGCGCGGCTCGTGGTGCTCTCCACCGCTACGCCTCCGGGATCGGTCACC 
GTGCCACACCTCAATATCGAGGAGGTGGCCCTGTCTAATACTGGAGAGATCCCCTTCTACGGCAAAGCCA 

TTCCCATCGAGGCTATCAAGGGGGGAAGGCATCTCATTTTCTGCCATTCCAAGAAGAAGTGTGACGAACT
CGCCGCAAAGCTGTCAGGCCTCGGACTCAATGCCGTAGCGTATTACCGGGGTCTTGACGTGTCCGTCATA 
CCGACCAGCGGAGACGTTGTTGTCGTGGCGACGGACGCTCTAATGACGGGCTTTACCGGCGACTTTGACT 
CAGTGATCGACTGTAATACGTGTGTCACCCAGACAGTCGATTTCAGCTTGGACCCCACCTTCACCATTGA 
GACGACGACCGTGCCCCAAGACGCAGTGTCGCGCTCGCAGAGGCGAGGCAGGACTGGTAGGGGCAGGGCT 

GGCATATACAGGTTTGTGACTCCAGGAGAACGGCCCTCGGGCATGTTCGATTCTTCGGTCCTGTGTGAGT
GTTATGACGCGGGTTGTGCGTGGTACGAACTCACGCCCGCTGAGACCTCGGTTAGGTTGCGGGCGTACCT 
AAACACACCAGGGTTGCCCGTCTGCCAGGACCATCTGGAGTTCTCGGAGGGTGTCTTCACAGGCCTCACC 
CACATAGATGCCCACTTCTTATCCCAGACTAAACAGGCAGGAGAGAACTTCCCCTACTTGGTAGCATACC 
AGGCTACAGTGTGCGCCAGGGCTCAAGCCCCACCTCCATCGTGGGATGAAATGTGGAGGTGTCTCATACG 

GCTGAAACCTACGCTGCACGGGCCAACACCCCTGCTGTATAGGTTAGGAGCCGTCCAAAATGAGGTCACC
CTCACACACCCCATAACCAAATTCATCATGACATGTATGTCGGCTGACCTGGAGGTCGTCACCAGCACCT 
GGGTGCTGGTAGGCGGAGTCCTCGCAGCTCTGGCCGCGTACTGCCTGACAACAGGCAGCGTGGTCATTGT 
GGGCAGGATCATCCTGTCCGGGAAGCCGGCTATCATCCCCGATAGGGAAGTTCTCTACCAGGAGTTCGAC
GAGATGGAGGAGTGTGCCTCACACCTCCCTTACTTCGAACAGGGAATGCAGCTCGCCGAGCAATTCAAAC 
AGAAGGCGCTCGGGTTGCTGCAAACAGCCACCAAGCAGGCGGAGGCTGCTGCTCCCGTGGTGGAGTCCAA 
GTGGCGAGCCCTTGAGACCTTCTGGGCGAAGCACATGTGGAACTTCATTAGTGGGATACAGTACTTGGCA 
GGCTTGTCCACTCTGCCTGGGAACCCCGCAATACGATCACCGATGGCATTCACAGCCTCCATCACCAGCC
CGCTCACCACCCAGCATACCCTCTTGTTTAACATCTTGGGGGGATGGGTGGCTGCCCAACTCGCCCCCCC 
CAGCGCTGCCTCAGCTTTCGTGGGCGCCGGCATCGCTGGAGCCGCTGTTGGCACGATAGGCCTTGGGAAG 
GTGCTTGTGGACATTCTGGCAGGTTATGGAGCAGGGGTGGCGGGCGCACTTGTGGCCTTTAAGATCATGA 
GCGGCGAGATGCCTTCAGCCGAGGACATGGTCAACTTACTCCCTGCCATCCTTTCTCCCGGTGCCCTGGT 

CGTCGGGATTGTGTGTGCAGCAATACTGCGTCGGCATGTGGGCCCAGGGGAAGGGGCTGTGCAGTGGATG
AACCGGCTGATAGCGTTCGCCTCGCGGGGTAACCACGTCTCCCCCAGGCACTATGTGCCAGAGAGCGAGC 
CTGCAGCGCGTGTTACCCAGATCCTTTCCAGCCTCACCATCACTCAGCTGTTGAAGAGACTCCACCAGTG 
GATTAATGAGGACTGCTCTACGCCATGCTCCAGCTCGTGGCTAAGGGAGATTTGGGACTGGATCTGCACG 
GTGTTGACTGACTTCAAGACCTGGCTCCAGTCCAAGCTCCTGCCGCGATTACCGGGAGTCCCTTTTTTCT 

CATGCCAACGCGGGTATAAGGGAGTCTGGCGGGGGGACGGCATCATGCACACCACCTGCCCATGCGGAGC
ACAGATCACCGGACACGTCAAAAACGGTTCCATGAGGATCGTTGGGCCTAAAACCTGCAGCAACACGTGG 
TACGGGACATTCCCCATCAACGCGTACACCACGGGCCCCTGCACACCCTCCCCGGCGCCAAACTATTCCA 
AGGCATTGTGGAGAGTGGCCGCTGAGGAGTACGTGGAGGTCACGCGGGTGGGAGATTTTCACTACGTGAC 
GGGCATGACCACTGACAACGTGAAGTGTCCATGCCAGGTTCCGGCCCCCGAATTCTTCACGGAGGTGGAT 

GGAGTGCGGTTGCACAGGTACGCTCCGGCGTGCAGACCTCTCCTACGGGAGGAGGTCGTATTCCAGGTCG
GGCTCCACCAGTACCTGGTCGGGTCACAGCTCCCATGCGAGCCCGAACCGGATGTAGCAGTGCTCACTTC 
CATGCTCACTGACCCCTCCCACATTACAGCAGAGACGGCTAAGCGTAGGCTGGCCAGGGGGTCTCCCCCC 
TCGTTGGCCAGCTCTTGAGCTAGCCAGTTGTCTGCGCCTTCCTTGAAGGCGACATGCACTACCCATCATG 
ACTCCCCGGACGCTGACCTCATTGAGGCCAACCTCTTGTGGCGGCAAGAGATGGGCGGGAACATCACCCG 

CGTGGAGTCAGAGAATAAGGTGGTAATCCTGGACTCTTTCGACCCGCTCCGAGCGGAGGATGATGAGGGG
GAAATATCCGTTCCGGCGGAGATCCTGCGGAAATCCAGGAAATTCCCCCCAGCGCTGCCCATATGGGCGC 
CGCCGGATTACAACCCTCCGCTGCTAGAGTCCTGGAAGGACCCGGACTACGTTCCTCCGGTGGTACACGG 
GTGCCCGTTGCCGCCCACCAAGGCCCCTCCAATACCACCTCCACGGAGGAAGAGGACGGTTGTCCTGACA 
GAATCCACCGTGTCTTCTGCCTTGGCGGAGCTCGCTACTAAGACCTTCGGCAGCTCCGGATCGTCGGCCA 

TCGACAGCGGTACGGCGACCGCCCCTCCTGACCAAGCCTCCGGTGACGGCGACAGAGAGTCCGACGTTGA
GTCGTTCTCCTCCATGCCCCCCCTTGAGGGAGAGCCGGGGGACCCCGATCTCAGCGACGGATCTTGGTCC 
ACCGTGAGCGAGGAGGCTAGTGAGGACGTCGTCTGCTGTTCGATGTCCTACACATGGACAGGCGCCCTGA 
TCACGCCATGCGCTGCGGAGGAAAGCAAGTTGCCCATCAACCCGTTGAGCAATTCTTTGCTACGTCACCA 
CAACATGGTCTATGCTACAACATCCCGCAGCGCAGGCCTGCGGCAGAAGAAGGTCACCTTTGACAGACTG 

CAAGTCCTGGACGACCACTACCGGGACGTGCTTAAGGAGATGAAGGCGAAGGCGTCCACAGTTAAGGCTA
AACTTCTATCTGTAGAAGAAGCCTGCAAACTGACGCCCCCACATTCGGCCAAATCCAAATTTGGCTACGG 
GGCGAAGGACGTCCGGAGCCTATCCAGCAGGGCCGTTACCCACATCCGCTCCGTGTGGAAGGACCTGCTG 
GAAGACACTGAAACACCAATTAGCACTACCATCATGGCAAAAAATGAGGTTTTCTGTGTCCAACCAGAGA 
AGGGAGGCCGCAAGCCAGCTCGCCTTATCGTGTTCCCAGATCTGGGAGTTCGTGTATGCGAGAAGATGGC 

CCTTTATGACGTGGTCTCCACCCTTCCTCAGGCCGTGATGGGCTCCTCATACGGATTCCAGTACTCTCCT
AAGCAGCGGGTCGAGTTCCTGGTGAATACCTGGAAATCAAAGAAATGCCCCATGGGCTTCTCATATGACA 
CCCGCTGTTTTGACTCAACGGTCACTGAGAATGACATCCGTGTTGAGGAGTCAATTTACCAATGTTGTGA 
CTTGGCCCCCGAAGCCAAACTGGCCATAAAGTCGCTCACAGAGCGGCTCTATATCGGGGGTCCCCTGACT 
AATTCAAAAGGGCAGAACTGCGGTTACCGCCGGTGCCGCGCGAGCGGCGTGCTGACGACTAGCTGCGGTA 

ATACCCTCACATGTTACCTGAAAGCCACTGCGGCCTGTCGAGCTGCGAAGCTCCGGGACTGCACGATGCT
CGTGAACGGAGACGACCTTGTCGTTATCTGTGAAAGCGCGGGAACCCAAGAGGATGCGGCGAGCCTACGA 
GTCTTCACGGAGGCTATGACTAGGTACTCTGCCCCCCCTGGGGACCCGCCTCAACCGGAATACGACTTGG 
AGTTGATAACATCATGTTCCTCCAATGTGTCGGTCGCACACGATGCATCTGGTAAAAGGGTGTACTACCT 
CACCCGTGACCCTACCACCCCCCTTGCACGGGCTGCGTGGGAGACAGCTAGACACACTCCAGTCAACTCC 

TGGCTAGGCAACATCATCATGTATGCGCCCACCTTATGGGCAAGGATGATTCTGATGACTCATTTCTTCT
CCATCCTTCTAGCTCAGGAGCAACTTGAAAAAACCCTAGATTGTCAGATCTACGGGGCCTGTTACTCCAT 
TGAACCACTTGATCTACCTCAGATCATTGAGCGACTCCATGGTCTTAGCGCATTTTCACTCCATAGTTAC 
TCTCCAGGCGAGATCAATAGGGTGGCTTCATGCCTCAGAAAACTTGGGGTACCACCCTTGCGAGCCTGGA 
GACATCGGGCCAGAAGTGTCCGCGCTAAGCTACTGTCCCAGGGGGGGAGGGCCGCCACTTGTGGCAAGTA 
CCTCTTCAACTGGGCGGTGAGGACCAAGCTCAAACTCACTCCAATCCCAGCCGCGTCCCGGTTGGACTTG
TCCGGCTGGTTCGTTGCTGGTTACAGCGGGGGAGACATATATCACAGCCTGTCTCGTGCCCGACCCCGCT 
GGTTCATGTTGTGCCTACTCCTACTTTCCGTGGGGGTAGGCATCTACCTGCTCCCCAACCGATGAATGGG 
GAGCTAAACACTCCAGGCCAATAGGCCGTTTCTC (SEQ ID NO: 6689)



gi|329739|gb|L02836.1|HPCCGENOM Hepatitis C China virus complete genome
ATTGGGGGCGACACTCCACCATAGATCACTCCCCTGTGAGGAACTACTGTCTTCACGCAGAAAGCGTCTA 
GCCATGGCGTTAGTATGAGTGTCGTGCAGCCTCCAGGACCCCCCCTCCCGGGAGAGCCATAGTGGTCTGC 
GGAACCGGTGAGTACACCGGAATTGCCAGGACGACCGGGTCCTTTCTTGGATCAACCCGCTCAATGCCTG 
GAGATTTGGGCGTGCCCCCGCGAGACTGCTAGCCGAGTAGTGTTGGGTCGCGAAAGGCCTTGTGGTACTG 
CCTGATAGGGTGCTTGCGAGTGCCCCGGGAGGTCTCGTAGACCGTGCACCATGAGCACGAATCCTAAACC 
TCAAAGAAAAACCAAACGTAACACCAACCGCCGCCCACAGGACGTCAAGTTCCCGGGCGGTGGTCAGATC 
GTTGGTGGAGTTTACCTGTTGCCGCGCAGGGGCCCCAGGTTGGGTGTGCGCGCGACTAGGAAGACTTCCG 
AGCGGTCGCAACCTCGTGGAAGGCGACAACCTATCCCCAAGGCTCGCCGACCCGAGGGCAGGACCTGGGC 
TCAGCCCGGGTATCCTTGGCCCCTCTATGGCAATGAGGGCTTTGGGTGGGCAGGATGGCTCCTGTCACCC 
CGCGGCTCCCGGCCTAGTTGGGGCCCCACGGACCCCCGGCGTAGGTCGCGTAATTTGGGTAAGGTCATCG 
ATACCCTCACATGCGGCTTCGCCGACCTCATGGGGTACATTCCGCTCGTCGGCGCCCCCTTGGGGGGCGC 
TGCCAGGGCCCTGGCACATGGTGTCCGGGTTCTGGAGGACGGCGTGAACTATGCAACAGGGAATTTGCCC 
GGTTGCTCTTTCTCTATCTTCCTTTTAGCCTTGCTATCCTGTTTGACCACCCCAGCTTCCGCTTACGAAG 
TGCGTAACGTGTCCGGGATATACCATGTCACGAACGACTGCTCCAACTCAAGCATTGTGTATGAGGCAGC 
GGACCTGATCATGCATACCCCTGGGTGCGTGCCCTGCGTTCGGGAAGGCAACTCCTCCCGTTGCTGGGTA 
GCGCTCACTCCCACGCTCGCGGCCAGGAACGCCACGATCCCCACTGCGACAGTACGACGGCATGTCGATC 
TGCTCGTTGGGGCGGCTGCTTTCTCTTCCGCCATGTACGTGGGGGATCTCTGCGGATCTGTTTTCCTTGT 
CTCTCAGCTGTTCACCTTCTCGCCTCGCCGGTATGAGACAATACAGGACTGCAATTGCTCAATCTATCCC 
GGCCACGTAACAGGTCACCGCATGGCTTGGGATATGATGATGAACTGGTCGCCTACAACAGCTCTAGTGG 
TGTCGCAGTTACTCCGGATCCCTCAAGCCGTCATGGACATGGTGGTGGGGGCCCACTGGGGAGTCCTGGC 
GGGCCTTGCCTACTATGCCATGGTGGGGAATTGGGCTAAGGTTTTGATTGTGATGCTACTCTTCGCCGGC 
GTTGATGGGGATACCTACGCGTCTGGGGGGGCGCAGGGCCGCTCCACCCTCGGGTTCACGTCCCTCTTTA 
CACCTGGGGCCTCTCAGAAGATCCAGCTTATAAATACCAATGGTAGCTGGCATATCAACAGGACTGCCCT 
GAACTGCAATGACTCCCTCAATACTGGGTTTCTTGCCGCGCTGTTCTATACACACAGGTTCAACGCGTCC 
GGATGCGCAGAGCGCATGGCCAGCTGCCGCCCCATTGATACATTCGATCAGGGCTGGGGCCCCATCACTT 
ATACTGAGCCTGATAGCTCGGACCAGAGGCCTTATTGCTGGCACTACGCGCCTCGAAAGTGCGGCATCGT 
AGCTGCGTCGGAGGTGTGCGGTCCAGTGTATTGTTTCACCCCAAGCCCTGTCGTCGTGGGGACGACCGAT 
CGTTTCGGTGTCCCCACATATAGCTGGGGGGAGAATGAGACAGACGTGCTGCTCCTCAACAACACGCGGC 
CGCCGCAAGGCAACTGGTTTGGCTGTACATGGATGAATGGCACTGGGTTCACCAAGACGTGCGGGGGGCC 
TCCGTGTAACATCGGGGGGGTCGGCAACAACACTTTGACTTGCCCCACGGATTGCTTTCGGAAGCACCCC 
GAGGCTACGTATACAAGGTGTGGTTCGGGGCCTTGGCTGACACCTAGGTGCTTAGTTGACTACCCATACA 
GGCTCTGGCACTACCCCTGCACTGTCAACTTTGCCATCTTCAAAGTTAGGATGTATGTGGGGGGCGTGGA 
GCACAGGCTCGATGCTGCATGCAACTGGACTCGAGGAGAGCGCTGTAACTTGGAGGACAGGGATAGATCA 
GAACTCAGCCCGCTGCTACTGTCTACAACAGAGTGGCAGATACTACCCTGCGCCTTCACCACCCTACCGG 
CTCTGTCCACTGGTTTAATCCATCTCCATCAGAACATCGTGGACGTGCAATACCTGTACGGTATAGGGTC 
AGCGGTTGCCTCCTTTGCAATTAAATGGGAGTATGTCTTGTTGCTTTTCCTTCTACTAGCAGACGCGCGC 
GTATGTGCCTGCTTGTGGATGATGCTGCTGATAGCCCAGGCCGAGGCCGCCTTAGAGAACCTGGTGGTCC 
TCAATGCGGCGTCCGTGGCCGACGCGCATGGCATCCTCTCCTTCCTTGTGTTCTTTTGTGCCGCCTGGTA 
CATTAAGGGCAGGCTGGTCCCCGGGGCAGCATACGCTTTCTACGGCGTGTGGCCGCTGCTCCTGCTCCTG 
CTGACATTACCACCACGAGCTTACGCCATGGACCGGGAGATGGCTGCATCGTGCGGAGGCGCGGTTTTTG 
TAGGTCTGGTATTCCTGACTTTGTCACCATACTACAAGGTGTTCCTCGCTAGGCTCATATGGTGGTTGCA 
ATACTTCCTCACCATAGCCGAGGCGCACCTGCAAGTGTGGATCCCCCCTCTCAACATTCGAGGGGGCCGC 
GATGCCATCATCCTCCTCACGTGTGCAATCCACCCAGAGTCAATCTTTGACATCACCAAACTCCTGCTCG 
CCACGCTCGGTCCGCTCCTGGTGCTTCAGGCTGGCATAACTAGAGTGCCGTACTTTGTGCGCGCTCATGG 
GCTCATTCGCGCGTGCATGCTATTGCGGAAAGTTGCTGGGGGTCATTATGTCCAAATGGCCTTCATGAAG 
CTGGGCGCACTGACAGGTACGTACGTCTATAACCATCTTACTCCGCTGCAGTATTGGCCACGCGCGGGTT 
TACGAGAACTCGCGGTGGCAGTAGAGCCCGTCATCTTCTCTGACATGGAGACCAAGATTATCACCTGGGG 
GGCAGACACTGCAGCGTGTGGAGACATCATCTTGGGTTTACCCGTCTCCGCCCGAAGGGGAAAGGAGATA 
CTCCTGGGGCCGGCCGATAGTCTTGAAGGGCAGGGGTGGCGACTCCTTGCGCCCATCACGGCCTACTCCC 
AACAGACGCGGGGCTTACTTGGTTGCATCATCACTAGCCTCACAGGCCGAGACAAGAACCAGGTCGAGGG 
GGAGGTTCAAGTGGTCTCCACCGCAACACAATCTTTCCTGGCGACCTGCATCAACGGTGTGTGTTGGACT 

WO 2004/080406 



PCT/US2004/007070 



GTCTATCATGGCGCCGGCTCAAAAACCTTAGCCGGCCCAAAGGGCCCAATCACCCAAATGTACACCAATG 
TAGACCAGGACCTCGTCGGCTGGCACCGGCCCCCCGGGGCGCGTTCCCTAACACCATGCACCTGCGGCAG 
CTCGGACCTTTACTTGGTCACGAGACATGCTGATGTCATTCCGGTGCGCCGTCGAGGCGACAGTAGGGGG 
AGTTTACTCTCCCCCAGGCCTGTCTCCTACCTGAAGGGCTCGTCGGGGGGCCCACTGCTCTGCCCCTTCG 
GGCACGTTGCAGGCATCTTCCGGGCTGCTGTGTGCACCCGGGGGGTTGCGAAGGCGGTGGATTTTATACC
CGTTGAGACCATGGAAACTACCATGCGGTCCCCGGTCTTCACGGACAACTCATCCCCTCCTGCCGTACCG 
CAGACATTCCAAGTGGCCCATCTACACGCTCCCACTGGCAGCGGCAAAAGCACCAAGGTGCCGGCTGCAT 
ATGCAGCCCAAGGGTACAAGGTACTTGTCTTGAACCCGTCTGTTGCCGCCACTTTAGGTTTTGGGGCGTA 
TATGTCTAAGGCACATGGTGTCGACCCCAACATTAGAACCGGGGTAAGGACCATCACCACGGGCGCCCCC 

ATCACATACTCTACCTATGGCAAGTTCCTTGCTGATGGTGGTTGCTCTGGGGGTGCCTATGACATTATAA
TATGTGATGAGTGCCATTCAACTGACTCGACTACCATCTTGGGCATCGGCACGGTCCTGGACCAAGCGGA 
GACGGCTGGAGCGCGGCTTGTCGTGCTCGCCACCGCTACGCCTCCGGGATCGGTCACCGTGCCACATCCA 
AACATCGAGGAGGTGGCCCTGTCCAATACTGGAGAGATCCCCTTCTATGGTAAAGCCATCCCCATCGAAG 
CCATCAGGGGGGGAAGGCATCTCATTTTCTGCCACTCCAAGAAGAAGTGTGACGAGCTTGCTGCAAAGCT 

ATCATCGCTCGGGCTCAACGCTGTGGCGTACTACCGGGGGCTTGATGTGTCCGTCATACCATCTAGCGGA
GACGTCGTTGTCGTGGCAACGGACGCTCTAATGACGGGCTTTACGGGCGACTTTGACTCAGTGATCGACT 
GTAACACATGTGTTACCCAAACAGTCGATTTCAGCTTGGACCCCACCTTCACCATCGAGACAACGACCGT 
GCCCCAAGACGCGGTGTCGCGCTCGCAGCGGCGAGGTAGGACTGGCAGGGGTAGGGAAGGCATCTACAGG 
TTTGTTACTCCAGGAGAACGGCCCTCGGGCATGTTCGACTCCTCAGTCCTGTGTGAGTGCTATGACGCGG 

GCTGTGCTTGGTACGAGCTCACGCCGGCTGAGACCACGGTTAGGTTGCGGGCTTACCTAAATACACCAGG
GTTGCCCGTCTGCCAGGACCATCTGGAGTTCTGGGAGGGCGTCTTCACAGGTCTCACCCATATAGACGCT 
CACTTTCTGTCCCAGACCAAGCAAGCAGGAGACAACTTCCCCTACCTGGTAGCATACCAAGCTACAGTGT 
GTGCCAAGGCTCAGGCCCCACCTCCATCGTGGGATCAAATGTGGAAGTGCCTCACACGGCTAAAGCCTAC 
GCTGCAGGGACCAACACCCCTGCTGTATAGGCTAGGAGCCGTCCAAAATGAGGTCACCCTCACACACCCC 

ATAACTAAATACATCATGACATGCATGTCGGCTGACCTGGAGGTCGTCACCAGCACCTGGGTGCTGGTGG
GCGGAGTCCTTGCAGCTCTGGCCGCGTATTGCCTGACAACGGGCAGCGTGGTCATTGTGGGTAGGATTGT 
CTTGTCCGGAAGTCCGGCTATTGTTCCTGACAGGGAAGTTCTTTACCAAGACTTCGACGAGATGGAAGAG 
TGTGCCTCACACCTCCCTTACATCGAACAGGGAATGCAGCTCGCCGAGCAGTTCAAGCAGAAGGCGCTCG 
GGTTGCTGCAAACAGCCACCAAGCAAGCGGAGGCTGCTGCTCCCGTGGTGGAGTCCAAGTGGCGAGCCCT 

CGAGACATTTTGGGAAAAACACATGTGGAATTTCATCAGCGGGATACAGTACTTAGCAGGCTTATCCACT
CTGCCTGGGAACCCCGCAATGGCATCACTGATGGCATTCACAGCTTCTATCACCAGCCCGCTCACTACCC 
AACACACCCTCCTGTTTAACATCTTGGGTGGATGGGTGGCTGCCCAACTCGCTCCCCCCAGCGCCGCTTC 
GGCCTTTGTGGGCGCCGGCATTGCCGGTGCGGCTGTTGGCAGCATAGGCCTTGGGAAGGTGCTTGTGGAC 
ATCCTGGCGGGTTATGGGGCGGGGGTGGCTGGCGCACTCGTGGCCTTTAAGGTCATGAGTGGCGAAATGC 

CCTCCACTGAGGACCTGGTTAATTTACTCCCTGCCATCCTCTCTCCTGGTGCCCTAGTCGTCGGGGTCGT
GTGCGCAGCAATACTGCGCCGACACGTGGGCCCGGGAGAGGGGGCTGTGCAGTGGATGAACCGGCTGATA 
GCGTTCGCTTCGCGGGGTAACCATGTCTCCCCCACGCACTATGTGCCTGAAAGTGACGCCGCAGCGCGTG 
TTACCCAGATCCTCTCCAGCCTTACCATCACTCAGCTGCTGAAAAGACTTCACCAGTGGATTAATGAGGA 
CTGTTCCACACCATGCTCCGGCTCGTGGCTAAGGGATGTTTGGGATTGGATATGCACGGTGTTGACCGAT 

TTCAAGACCTGGCTCCAGTCCAAGCTCCTGCCGCGGTTGCCCGGAGTCCCTTTCCTCTCATGCCAACGCG
GGTACAAGGGAGTCTGGCGGGGGGACGGTATTATGCAAACCACCTGTCCATGTGGAGCACAGATTACTGG 
ACATGTCAAAAACGGTTCCATGAGAATCGTTGGGCCTAAGACTTGTAGCAACACGTGGCATGGAACATTC 
CCCATCAACGCGTACACCACGGGCCCCTGCACACCCTCCCCGGCGCCGAACTATTCCAGGGCGCTGTGGC 
GGGTGGCTCCTGAGGAGTACGTGGAGGTTACGCGGGTGGGGGATTTCCACTACGTGACGGGCATGACCAC 

CGACAACGTGAAATGCCCATGCCAAGTCCCGGCCCCTGAATTCTTCACGGAGGTGGATGGAGTACGGCTG
CACAGGTACGCTCCGGCGTGCAAACCTCTCCTACGGGAGGAGGTCGTGTTCCAGGTCGGGCTCAACCAAT 
ACCTGGTTGGATCACAGCTCCCATGCGAGCCCGAGCCGGACGTAACAGTGCTCACTTCCATGCTTACCGA 
CCCCTCCCACATCACAGCAGAGACGGCCAAGCGTAGGCTGGCCAGGGGGTCTCCCCCCTCCTTGGCCAGC 
TCTTCAGCTAGCCAATTGTCTGCGCCTTCTTTGAAGGCGACATGTACTACCCATCATGACTCCCCGGACG

CCGACCTCATTGAGGCCAACCTCCTGTGGCGGCAGGAGATGGGCGGAAACATCACCCGTGTGGAGTCAGA
AAATAAGGTAGTGATCCTGGACTCTTTCGACCCGCTTCGGGCGGAGGAGGACGAGAGGGAAGTATCCGTT 
GCGGCGGAGATCCTGCGGAAATCCAGGAAGTTCCCCTCAGCGCTGCCCATATGGGCACGCCCAGACTACA 
ACCCTCCACTGCTAGAGTCCTGGAAGGACCCAGATTATGTCCCTCCGGTGGTACACGGGTGCCCGTTGCC 
GCCTACCACGGCCCCTCCAGTACCACCTCCACGGAGAAAAAGGACGGTCGTCCTAACAGAGTCATCCGTG 

TCTTCTGCCTTGGCGGAGCTCGCTACTAAGACCTTCGGCAGCTCTGAATCGTCGGCCGTCGACAGCGGCA
CGGCGACTGCCCCTCCTGACGAGGCCTCCGGCGGCGGCGACAAAGGATCCGACGTTGAGTCGTACTCCTC 
CATGCCCCCCCTTGAGGGAGAGCCGGGGGACCCCGACCTCAGCGACGGGTCCTGGTCTACCGTGAGTGAG 
GAGGCCAGTGAGGACGTCGTCTGCTGCTCAATGTCCTATACATGGACAGGCGCCTTGATCACGCCATGTG
CTGCGGAGGAGAGCAAGCTGCCCATCAACCCGCTGAGCAACTCCTTGCTGCGTCACCACAACATGGTCTA 
TGCTACAACATCCCGCAGTGCAAGCCTACGGCAGAAGAAGGTCGCTTTTGACAGAATGCAAGTCCTGGAC 
GACCACTACCGGGACGTGCTCAAGGAGATGAAGGCGAAGGCGTCCACAGTTAAGGCTAAACTCCTATCCA 
TAGAAGAGGCCTGCAAGCTGACGCCCCCACATTCAGCCAAATCCAAATTTGGCTATGGGGCAAAAGACGT 
CCGGAACCTATCCAGCAAGGCCGTTAACCACATCCGCTCCGTGTGGAAGGACTTGTTGGAAGACAATGAG 
ACACCAATCAATACCACCATCATGGCAAAAAATGAGGTTTTCTGCGTCCAACCAGAGAAAGGAGGCCGTA 
AGCCAGCTCGCCTTATCGTATTCCCAGACTTGGGAGTCCGTGTGTGCGAGAAGATGGCCCTTTATGACGT 
GGTCTCCACCCTTCCTCAGCCCGTGATGGGCTCCTCATACGGATTCCAGTACTCTCCTGGGCAGCGGGTC 
GAATTCCTGCTAAATGCCTGGAAATCAAAGGAAAACCCTATGGGCTTCTCATATGACACCCGCTGTTTTG 
ACTCAACGGTCACTCAGAACGACATCCGTGTTGAGGAGTCAATTTACCAATGTTGTGACTTGGCCCCCGA 
GGCCAGACGGGCCATAAAGTCGCTCACAGAGCGGCTCTATATCGGGGGTCCCCTGACTAATTCAAAAGGG 
CAGAACTGCGGTTATCGCCGGTGCCGCGCAAGTGGCGTGCTGACGACCAGCTGCGGTAATACCCTTACAT 
GTTACTTGAAGGCCTCTGCGGCCTGTCGAGCTGCGAAGCTGCAGGACTGCACGATGCTCGTGAACGGAGA 
CGACCTTGTCGTTATCTGTGAAAGCGCGGGAACTCAAGAGGATGCGGCGAGCCTACGAGTCTTCACGGAG 
GCTATGACTAGGTACTCTGCCCCCCCTGGGGACCTGCCCCAACCAGAATACGACTTGGAGCTAATAACAT 
CATGCTCCTCCAATGTGTCAGTCGCCCACGATGCATCTGGCAAAAGGGTGTACTACCTCACCCGTGACCC 
CACCATCCCCCTCGCGCGGGCTGCGTGGGAGACAGCTAGACACACTCCAGTCAACTCCTGGCTAGGCAAC 
ATCATCATGTATGCGCCCACTCTATGGGCAAGGATGATTCTGATGACTCACTTCTTCTCCATCCTTCTAG 
CTCAGGAGCAACTTGAGAAAGCCCTGGATTGCCAAATCTACGGGGCCTACTACTCCATTGAGCCACTTGA 
CCTACCTCAGATCATTGAACGACTCCATGGCCTTAGCGCATTTTCACTCCATAGTTACTCTCCAGGTGAG 
ATCAATAGGGTGGCGTCATGTCTCAGGAAACTTGGGGTACCACCCTTGCGAGTCTGGAGACATCGGGCCA 
GAAGCGTCCGCGCTAAGCTACTGTCCCAGGGGGGGAGGGCCGCCACTTGTGGCAAGTACCTCTTCAACTG 
GGCAGTAAAGACCAAGCTTAAACTCACTCCAATCCCGGCTGCGTCCCGGTTGGACTTGTCCGGCTGGTTC 
GTTGCTGGTTACAGCGGGGGAGACATATATCACAGCCTGTCTCGTGCCCGACCCCGTTGGTTCATGTTGT 
GCCTACTCCTACTTTCTGTAGGGGTAGGCATCTACCTGCTCCCCAACCGATGAACGGGGAGATAAACACT 
CCAGGCCAATAGGCCATCCC (SEQ ID NO: 6690) 



gi|15422182|gb|AY051292.1| Hepatitis C virus from India polyprotein mRNA, complete cds

GCCAGCCCCCTGATGGGGGCGACACTCCACCATAGATCACTCCCCTGTGAGGAACTACTGTCTTCACGCA 
GAAAGCGTCTAGCCATGGCGTTAGTATGAGTGTCGTGCAGCCTCCAGGACCCCCCCTCCCGGGAGAGCCA 
TAGTGGTCTGCGGAACCGGTGAGTACACCGGAATTGCCAGGACGACCGGGTCCTTTCTTGGATCAACCCG 
CTCAATGCCTGGAGATTTGGGCGTGCCCCCGCAAGACTGCTAGCCGAGTAGTGTTGGGTCGCGAAAGGCC 
TTGTGGTACTGCCTGATAGGGTGCTTGCGAGTGCCCCGGGAGGTCTCGTAGACCGTGCACCATGAGCACG 
AATCCTAAACCTCAAAGAAAAACCAAACGTAACACCAACCGACGCCCACAGAACGTTAAGTTCCCGGGTG 
GCGGCCAGATCGTTGGCGGAGTTTGCTTGTTGCCGCGCAGGGGTCCCAGAGTGGGTGTGCGCGCGACGAG 
GAAGACTTCCGAGCGGTCACAACCTCGCGGAAGGCGTCAGCCTATTCCCAAGGCCCGCCGACCCGAGGGC 
AGGTCCTGGGCGCAGCCCGGGTACCCTTGGCCCCTCTATGGCAACGAGGGCTGTGGGTGGGCAGGATGGC 
TCTTGTCCCCCCGCGGCTCCCGGCCTAGTCGGGGCCCCTCTGACCCCCGGCGCAGGTCACGCAATTTGGG 
TAAGGTCATCGATACCCTCACGTGTGGCTTCGCCGACCTCATGGGGTACATCCCGCTCGTCGGTGCTCCT 
CTAGGGGGCGCTGCTAGGGCTCTGGCACATGGTGTTAGGGTTCTAGAAGACGGCGTAAATTACGCAACAG 
GGAACCTTCCTGGTTGCTCTTTTTCTATCTTCTTGCTTGCTCTTCTCTCCTGCTTGACAGTCCCTGCTTC 
GGCCGTCGAAGTGCGCAACTCTTCGGGGATCTACCATGTCACCAATGATTGCCCCAATGCGTCTGTTGTG 
TACGAGACAGATAGCTTGATCATACATCTGCCCGGGTGTGTGCCCTGCGTACGCGAGGGCAACGCTTCGA 
GGTGCTGGGTCTCCCTTAGTCCTACTGTTGCCGCTAAGGATCCGGGCGTCCCCGTCAACGAGATTCGGCG 
TCACGTCGACCTGATTGTCGGGGCCGCTGCATTCTGTTCGGCTATGTATGTAGGGGACTTATGCGGTTCC 
ATCTTCCTCGTTGGCCAGCTTTTCACCCTCTCCCCTAGGCGCCACTGGACAACACAAGACTGTAATTGCT 
CCATCTACCCAGGACATGTGACAGGCCATCGAATGGCTTGGGACATGATGATGAATTGGTCACCTACTGG 
CGCTTTGGTGGTAGCGCAGCTACTCCGGATCCCACAAGCCGTCTTGGATATGATAGCCGGTGCCCACTGG 
GGTGTCCTAGCGGGCCCGGCATACTACTCCATGGTGGGGAACTGGGCTAAGGTTTTGGTTGTGCTACTGC 
TCTTCGCTGGCGTCGATGCAACCACCCAAGTCACAGGTGGCACCGCGGGCCGTAATGCATATAGATTGGC 
TAGCCTCTTCTCCACCGGCCCCAGCCAAAATATCCAGCTCATAAACTCCAATGGCAGCTGGCACATTAAC 
AGGACTGCCCTGAATTGCAATGACAGCCTGCACACCGGCTGGGTAGCAGCGCTGTTCTACTCCCACAAGT 

TCAACTCTTCGGGGCGTCCTGAGAGGATGGCTAGTTGTCGGCCTCTTACCGCCTTCGACCAAGGGTGGGG 
GCCCATCACTTACGGGGGGAAAGCTAGTAACGACCAGCGGCCGTATTGCTGGCACTATGCCCCACGCCCG 
TGCGGTATCGTGCCGGCGAAAGAGGTTTGCGGGCCTGTATACTGTTTCACACCCAGTCCCGTGGTAGTGG 
GGACGACGGACAAGTACGGCGTTCCTACCTACACATGGGGCGAGAATGAGACGGATGTACTGCTCCTTAA 
CAACTCTAGGCCGCCAATAGGGAATTGGTTCGGGTGTACGTGGATGAATTCCACTGGTTTCACCAAGACG
TGCGGGGCTCCTGCCTGTAACGTCGGCGGGAGCGAGACCAACACCCTGTCGTGCCCCACAGATTGCTTCC 
GCAGACATCCGGACGCAACATACGCTAAGTGCGGCTCTGGCCCTTGGCTTAACCCTCGATGCATGGTGGA 
CTACCCTTACAGGCTCTGGCACTATCCCTGCACAGTCAATTACACCATATTCAAGATCAGGATGTTCGTG 
GGCGGGATTGAGCACAGGCTCACCGCCGCGTGCAACTGGACGCGGGGAGAGCGCTGCGACTTGGACGACA 
GGGATCGTGCCGAGTTGAGCCCGCTGTTGCTGTCCACCACGCAATGGCAGGTCCTCCCCTGCTCATTCAC
AACGCTGCCCGCCCTGTCAACTGGCCTAATACATCTCCACCAGAACATCGTGGACGTGCAGTACCTCTAC 
GGGTTGAGCTCGGTAGTTACATCCTGGGCCATAAGGTGGGAGTATGTCGTGCTCCTTTTCTTGCTGTTAG 
CAGATGCCCGCATTTGTGCCTGCCTTTGGATGATGCTTCTCATATCCCAGGTAGAGGCGGCGCTGGAGAA 
CCTGATAGTCCTCAACGCTGCTTCCCTGGCTGGGACACACGGCATCGTCCCTTTCTTCATCTTTTTTTGT 
GCAGCCTGGTATCTGAAAGGCAAGTGGGCCCCTGGACTCGTCTACTCCGTCTACGGAATGTGGCCGCTGC
TCCTGCTTCTCCTGGCGTTGCCCCAACGGGCGTACGCCTTGGATCAGGAGTTGGCCGCGTCGTGTGGGGC 
CGTGGTCTTCATCAGCCTAGCGGTACTTACCCTGTCGCCGTACTACAAACAGTACATGGCCCGCGGCATC 
TGGTGGCTGCAGTACATGCTGACCAGAGCGGAGGCGCTCCTGCACGTCTGGGTCCCCTCGCTCAACGCCC 
GGGGAGGGCGTGATGGTGCCATACTGCTCATGTGTGTGCTCCACCCGCACTTGCTCTTTGACATCACCAA 
AATCATGCTGGCCATTCTCGGGCCCCTGTGGATCTTGCAGGCCAGTCTGCTCAGGGTGCCGTACTTCGTG
CGCGCCCACGGTCTCATTAGGCTCTGCATGCTGGTGCGCAAAACAGCGGGCGGTCACTATGTGCAGATGG 
CTCTGTTGAAGCTGGGGGCACTTACTGGCACTTACATTTACAACCACCTTTCCCCACTCCAAGACTGGGC 
TCATGGCAGCTTGCGTGATCTAGCGGTGGCCACCGAGCCCGTCATCTTCTCCCGGATGGAGATCAAGACT 
ATCACCTGGGGGGCAGACACCGCGGCCTGTGGAGACATCATCAACGGGCTGCCTGTTTCTGCTCGGAGGG 
GGAGAGAGGTGTTGTTGGGACCAGCCGATGCCCTGACTGACAAGGGATGGAGGCTTTTAGCCCCCATCAC
AGCTTACGCCCAACAGACACGAGGTCTCTTGGGCTGTATTGTCACCAGCCTCACCGGTCGGGACAAAAAT 
CAAGTGGAGGGGGAAATCCAGATTGTGTCTACCGCAACCCAGACGTTCTTGGCCACTTGCATCAACGGAG 
CTTGCTGGACTGTTTATCATGGGGCCGGATCGAGGACCATCGCTTCGGCGTCGGGTCCTGTGGTCCGGAT 
GTACACCAATGTGGACCAGGATTTGGTGGGCTGGCCAGCGCCTCAGGGAGCGCGCTCCCTGACGCCGTGC 
ACGTGCGGTGCCTCGGATCTGTACTTGGTCACGAGGCACGCGGATGTCATCCCAGTGCGGCGTCGAGGCG
ATAACAGGGGAAGCTTGCTTTCTCCCCGGCCCATCTCATACCTAAAAGGATCCTCGGGAGGCCCTCTGCT 
CTGCCCCATGGGACATGTCGCGGGCATTTTTAGGGCCGCGGTGTGCACCCGTGGGGTTGCAAAGGCGGTC 
GACTTTGTGCCCGTTGAGTCCTTAGAGACCACCATGAGGTCCCCAGTGTTTACTGACAATTCCAGCCCTC 
CAACAGTGCCGCAGAGTTACCAGGTGGCACATCTACATGCACCCACTGGGAGTGGCAAGAGCACGAAGGT 
GCCGGCCGCTTACGCAGCTCAAGGGTACAAGGTACTTGTGCTGAACCCGTCTGTTGCTGCCACCTTAGGG
TTCGGTGCTTATATGTCAAAGGCCCATGGGATTGACCCAAACGTCAGGACCGGCGTGAGGACCATTACCA 
CAGGCTCCCCCATCACCTACTCCACCTACGGGAAATTTTTGGCTGATGGCGGATGCCCAGGAGGTGCGTA 
CGACATCATAATATGTGACGAATGTCACTCAGTGGACGCCACCTCGATTCTGGGCATAGGGACCGTCTTG 
GACCAAGCGGAGACGGCGGGGGTTAGGCTGACTGTCCTTGCCACCGCTACACCACCTGGCTTGGTCACCG 
TGCCACATTCCAACATCGAGGAAGTTGCAGTGTCCGCTGACGGGGAGAAACCATTTTATGGTAAGGCCAT
CCCCCTAAACTACATCAAGGGGGGGAGGCATCTCATTTTCTGTCATTCCAAGAAGAAGTGCGACGAGCTC 
GCTGCAAAGCTGGTCGGTCTGGGCGTCAACGCGGTGGCCTTTTACCGTGGCCTCGACGTATCTGTCATTC 
CAACTACAGGAGACGTCGTTGTTGTAGCGACCGACGCCTTGATGACTGGCTTCACCGGCGATTTCGACTC 
TGTGATAGACTGCAACACCTGTGTCGTCCAGACAGTCGACTTCAGCCTAGACCCTATATTCTCTATTGAG 
ACTTCCACCGTGCCCCAGGACGCCGTGTCCCGCTCCCAACGGAGGGGTAGGACCGGTCGAGGGAAGCATG
GTATTTACAGATATGTGTCACCCGGGGAGCGGCCGTCTGGCATGTTCGACTCCGTGGTCCTCTGTGAGTG 
CTATGACGCGGGTTGTGCTTGGTACGAGCTTACACCCGCCGAGACCACAGTCAGGCTACGGGCATACCTT 
AACACCCCAGGATTGCCCGTGTGCCAGGACCACTTGGAGTTCTGGGAGAGTGTCTTCACCGGCCTCACCC 
ACATAGATGCCCACTTCCTGTCCCAGACGAAACAGAGTGGGGAGAACTTCCCCTACCTAGTCGCATACCA 
AGCCACCGTGTGCGCTAGAGCTAGAGCTCCTCCCCCGTCATGGGACCAAATGTGGAAGTGCCTGATACGG
CTCAAGCCCACCCTCACTGGGGCTACCCCATTACTATACAGACTGGGTAGTGTACAGAATGAGATCACCT 
TAACACACCCAATCACCCAATACATCATGGCTTGCATGTCGGCGGACCTGGAGGTCGTCACTAGCACGTG 
GGTGTTGGTGGGCGGCGTCCTAGCCGCTTTGGCCGCTTACTGCCTGTCCACAGGCAGCGTGGTCATAGTG 
GGCAGGATAATCCTAGGTGGGAAGCCGGCAGTCATACCTGACAGGGAGGTTCTCTACCGAGAGTTTGATG 
AGATGGAGGAGTGCGCCGCCCACGTCCCCTACCTCGAGCAGGGGATGCATTTGGCTGGACAGTTCAAGCA
GAAAGCTCTCGGGTTGCTCCAGACAGCATCCAAGCAAGCGGAGACGATCACTCCCACTGTCCGCACCAAC 
TGGCAGAAACTCGAGTCCTTCTGGGCTAAGCACATGTGGAACTTCGTTAGCGGGATACAATACCTGGCGG 
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GCCTGTCAACGCTGCCCGGGAACCCCGCTATAGCGTCGCTGATGTCGTTTACGGCCGCGGTGACGAGTCC 
ACTAACCACCCAGCAAACCCTCTTCTTTAACATCTTAGGGGGGTGGGTGGCGGCCCAGCTTGCTTCCCCA 
GCTGCCGCTACTGCTTTTGTCGGTGCTGGTATTACTGGCGCCGTTGTTGGCAGTGTGGGCCTAGGGAAGG 
TCCTAGTGGACATTATTGCTGGCTACGGGGCTGGTGTGGCGGGGGCCCTCGTGGCTTTCAAAATCATGAG 
CGGGGAGACCCCCACCACCGAGGATCTAGTCAACCTTCTGCCTGCCATCCTATCGCCAGGAGCTCTCGTT
GTCGGCGTGGTGTGCGCAGCAATACTACGCCGGCACGTGGGCCCTGGCGAGGGCGCCGTGCAGTGGATGA 
ACCGGCTGATAGCGTTTGCTTCTCGGGGTAACCACGTCTCCCCTACACACTACGTGCCGGAGAGCGACGC 
GTCGGCTCGTGTCACACAAATTCTCACCAGCCTCACTGTTACTCAGCTTCTGAAAAGGCTCCACGTGTGG 
ATAAGCTCGGATTGCATCGCCCCGTGTGCTAGTTCTTGGCTTAAAGATGTCTGGGACTGGATATGCGAGG 

TGCTGAGCGACTTCAAGAATTGGCTGAAGGCCAAACTTGTACCACAACTGCCCGGGATCCCATTCGTATC
CTGCCAACGCGGGTACCGTGGGGTCTGGCGGGGCGAGGGCATCGTGCACACTCGTTGCCCGTGTGGGGCC 
AATATAACTGGACATGTCAAGAACGGTTCGATGAGAATCGTCGGGCCTAAGACTTGCAGCAACACCTGGC 
GTGGGTCGTTCCCCATTAACGCTTACACTACAGGCCCGTGCACGCCCTCCCCGGCGCCGAACTATACGTT 
CGCGCTATGGAGGGTGTCTGCAGAGGAGTATGTGGAGGTAAGGCGGCTGGGGGACTTCCATTACGTCACG 

GGGGTGACCACTGATAAACTCAAGTGTCCATGCCAGGTCCCCTCACCCGAGTTCTTCACAGAGGTGGACG
GGGTGCGCCTGCATAGGTACGCCCCCCCCTGCAAACCCCTGCTGCGAGAAGAGGTGACGTTTAGCATCGG 
GCTCAATGAATACTTGGTGGGGTCCCAGTTGCCCTGCGAGCCCGAGCCAGACGTAGCTGTACTGACATCA 
ATGCTTACAGACCCCTCCCACATCACTGCAGAGACGGCAGCGCGTAGGCTGAAGCGGGGGTCTCCCCCCT 
CCCTGGCCAGCTCTTCCGCCAGCCAGCTGTCCGCGCCGTCACTGAAGGCAACATGCACCACTCACCACGA 

CTCTCCAGACGCTGACCTCATAGAAGCCAACCTCCTGTGGAGACAGGAGATGGGGGGGAACATCACTAGG
GTGGAGTCGGAGAACAAGATTGTCGTTCTGGATTCTTTCGACCCGCTCGTAGCGGAGGAGGATGATCGGG 
AGATCTCTATTCCAGCTGAGATTCTGCGGAAGTTCAAGCAGTTTCCTCCCGCTATGCCCATATGGGCACG 
GCCAGATTATAATCCTCCCCTTGTGGAACCGTGGAAGCGCCCGGACTATGAGCCACCCTTAGTCCACGGG 
TGCCCCCTACCACCTCCCAAGCCAACTCCGGTGCCGCCACCCCGGAGAAAGAGGACGGTGGTGCTGGACG 

AGTCTACAGTATCATCTGCTCTGGCTGAGCTTGCCACTAAGACCTTCGGCAGCTCTACAACCTCAGGCGT
GACAAGTGGTGAAGGGACTGAATCGTCCCCGGCGCCCTCCTGCGGCGGTGAGCTGGACTCCGAAGCTGAA 
TCTTACTCCTCCATGCCCCCTCTCGAGGGGGAGCCGGGGGACCCCGATCTCAGCGACGGGTCTTGGTCTA 
CCGTGAGCAGTGATGGTGGCACGGAAGACGTTGTGTGCTGCTCGATGTCTTACTCGTGGACGGGCGCTTT 
AATCACGCCCTGTGCCTCAGAGGAAGCCAAGCTCCCTATCAACGCATTGAGCAACTCGCTGCTGCGCCAC 

CACAACTTGGTGTATTCCACCACCTCTCGCAGCGCTGGCCAGAGACAGAAAAAAGTCACATTTGACAGAG
TGCAAGTCCTGGACGACCATTACCGGGACGTGCTCAAGGAGGCTAAGGCCAAGGCATCCACGGTGAAGGC 
TAGACTGCTATCCGTTGAGGAAGCGTGTAGCCTGACGCCCCCACACTCCGCCAGATCAAAATTTGGCTAT 
GGGGCGAAGGATGTCCGAAGCCATTCCAGTAAGGCTATACGCCACATCAACTCCGTGTGGCAGGACCTTC 
TGGAGGACAATACAACACCCATAGACACTACCATCATGGCAAAGAATGAGGTCTTCTGTGTGAAGCCCGA 

AAAGGGGGGCCGCAAGCCCGCTCGTCTTATCGTGTACCCCGACCTGGGAGTGCGCGTATGCGAGAAGAGG
GCTTTGTATGACGTAGTCAAACAGCTCCCCATTGCCGTGATGGGAGCCTCCTACGGGTTCCAGTACTCAC 
CAGCGCAGCGGGTCGACTTCCTGCTTAAAGCGTGGAAATCTAAGAAAGTCCCCATGGGGTTTTCCTATGA 
CACCCGTTGCTTTGACTCAACAGTCACTGAGGCTGATATCCGTACGGAGGAAGACCTCTACCAATCTTGT 
GACCTGGCCCCTGAGGCTCGCATAGCCATAAGGTCCCTCACAGAGAGGCTTTACATCGGGGGCCCACTCA 

CCAATTCTAAGGGACAAAACTGCGGCTATCGGCGATGCCGCGCAAGCGGCGTGCTGACCACTAGCTGCGG
TAACACCATAACCTGCTTCCTCAAAGCCAGTGCAGCCTGTCGAGCTGCGAAGCTCCAGGACTGCACCATG 
CTCGTGTGCGGCGACGACCTCGTCGTTATCTGTGAGAGCGCCGGTGTCCAGGAGGACGCTGCGAGCCTGA 
GAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCGGGAGACCCGCCTCAACCAGAATACGACTT 
GGAGCTTATAACATCCTGCTCCTCCAATGTGTCGGTCGCGCGCGACGGCGCTGGCAAAAGGGTCTATTAT 

CTGACCCGTGACCCTGAGACTCCCCTCGCGCGTGCCGCTTGGGAGACAGCAAGACACACTCCAGTGAACT
CCTGGCTAGGCAACATCATCATGTTTGCCCCCACTCTGTGGGTACGGATGGTCCTCATGACCCATTTTTT 
CTCCATACTCATAGCTCAGGAGCACCTTGGAAAGGCTCTAGATTGTGAAATCTATGGAGCCGTACACTCC 
GTCCAACCGTTGGACTTACCTGAAATCATCCAAAGACTCCACAGCCTCAGCGCGTTTTCGCTCCACAGTT 
ACTCTCCAGGTGAAATCAATAGGGTGGCTGCATGCCTCAGGAAGCTTGGGGTTCCGCCCTTGCGAGCTTG 

GAGACACCGGGCCCGGAGCGTTCGCGCCACACTCCTATCCCAGGGGGGGAAAGCCGCTATATGCGGTAAG
TACCTCTTCAACTGGGCGGTGAAAACCAAACTCAAACTCACTCCATTACCGTCCATGTCTCAGTTGGACT 
TGTCCAACTGGTTCACGGGCGGTTACAGCGGGGGAGACATTTATCACAGCGTGTCTCATGCCCGGCCCCG 
TTTGTTCCTCTGGTGCCTACTCCTACTTTCAGTAGGGGTAGGCATCTATCTCCTTCCCAACCGATAGACG 
GNTGGGCAACCACTCCGGGTCTTTAGGCCCTATTTAAACACTCCAGGCCTTTAGGCCCCGT 

(SEQ ID NO: 6691)
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gi|23510419|ref|NM_000043.3| Homo sapiens tumor necrosis factor receptor

superfamily, member 6 (TNFRSF6), transcript variant 1, mRNA 

CCTACCCGCGCGCAGGCCAAGTTGCTGAATCAATGGAGCCCTCCCCAACCCGGGCGTTCCCCAGCGAGGC 

TTCCTTCCCATCCTCCTGACCACCGGGGCTTTTCGTGAGCTCGTCTCTGATCTCGCGCAAGAGTGACACA 

CAGGTGTTCAAAGACGCTTCTGGGGAGTGAGGGAAGCGGTTTACGAGTGACTTGGCTGGAGCCTCAGGGG 

CGGGCACTGGCACGGAACACACCCTGAGGCCAGCCCTGGCTGCCCAGGCGGAGCTGCCTCTTCTCCCGCG 

GGTTGGTGGACCCGCTCAGTACGGAGTTGGGGAAGCTCTTTCACTTCGGAGGATTGCTCAACAACCATGC 

TGGGCATCTGGACCCTCCTACCTCTGGTTCTTACGTCTGTTGCTAGATTATCGTCCAAAAGTGTTAATGC 

CCAAGTGACTGACATCAACTCCAAGGGATTGGAATTGAGGAAGACTGTTACTACAGTTGAGACTCAGAAC 

TTGGAAGGCCTGCATCATGATGGCCAATTCTGCCATAAGCCCTGTCCTCCAGGTGAAAGGAAAGCTAGGG 

ACTGCACAGTCAATGGGGATGAACCAGACTGCGTGCCCTGCCAAGAAGGGAAGGAGTACACAGACAAAGC 

CCATTTTTCTTCCAAATGCAGAAGATGTAGATTGTGTGATGAAGGACATGGCTTAGAAGTGGAAATAAAC 

TGCACCCGGACCCAGAATACCAAGTGCAGATGTAAACCAAACTTTTTTTGTAACTCTACTGTATGTGAAC 

ACTGTGACCCTTGCACCAAATGTGAACATGGAATCATCAAGGAATGCACACTCACCAGCAACACCAAGTG 

CAAAGAGGAAGGATCCAGATCTAACTTGGGGTGGCTTTGTCTTCTTCTTTTGCCAATTCCACTAATTGTT 

TGGGTGAAGAGAAAGGAAGTACAGAAAACATGCAGAAAGCACAGAAAGGAAAACCAAGGTTCTCATGAAT 

CTCCAACCTTAAATCCTGAAACAGTGGCAATAAATTTATCTGATGTTGACTTGAGTAAATATATCACCAC 

TATTGCTGGAGTCATGACACTAAGTCAAGTTARAGGCTTTGTTCGAAAGAATGGTGTCAATGAAGCCAAA 

ATAGATGAGATCAAGAATGACAATGTCCAAGACACAGCAGAACAGAAAGTTCAACTGCTTCGTAATTGGC 

ATCAACTTCATGGAAAGAAAGAAGCGTATGACACATTGATTAAAGATCTCAAAAAAGCCAATCTTTGTAC 

TCTTGCAGAGAAAATTCAGACTATCATCCTCAAGGACATTACTAGTGACTCAGAAAATTCAAACTTCAGA 

AATGAAATCCAAAGCTTGGTCTAGAGTGAAAAACAACAAATTCAGTTCTGAGTATATGCAATTAGTGTTT 

GAAAAGATTCTTAATAGCTGGCTGTAAATACTGCTTGGTTTTTTACTGGGTACATTTTATCATTTATTAG 

CGCTGAAGAGCCAACATATTTGTAGATTTTTAATATCTCATGATTCTGCCTCCAAGGATGTTTAAAATCT 

AGTTGGGAAAACAAACTTCATCAAGAGTAAATGCAGTGGCATGCTAAGTACCCAAATAGGAGTGTATGCA 

GAGGATGAAAGATTAAGATTATGCTCTGGCATCTAACATATGATTCTGTAGTATGAATGTAATCAGTGTA 

TGTTAGTACAAATGTCTATCCACAGGCTAACCCCACTCTATGAATCAATAGAAGAAGCTATGACCTTTTG 

CTGAAATATCAGTTACTGAACAGGCAGGCCACTTTGCCTCTAAATTACCTCTGATAATTCTAGAGATTTT 

ACCATATTTCTAAACTTTGTTTATAACTCTGAGAAGATCATATTTATGTAAAGTATATGTATTTGAGTGC 

AGAATTTAAATAAGGCTCTACCTCAAAGACCTTTGCACAGTTTATTGGTGTCATATTATACAATATTTCA 

ATTGTGAATTCACATAGAAAACATTAAATTATAATGTTTGACTATTATATATGTGTATGCATTTTACTGG 

CTCAAAACTACCTACTTCTTTCTCAGGCATCAAAAGCATTTTGAGCAGGAGAGTATTACTAGAGCTTTGC 

CACCTCTCCATTTTTGCCTTGGTGCTCATCTTAATGGCCTAATGCACCCCCAAACATGGAAATATCACCA 

AAAAATACTTAATAGTCCACCAAAAGGCAAGACTGCCCTTAGAAATTCTAGCCTGGTTTGGAGATACTAA 

CTGCTCTCAGAGAAAGTAGCTTTGTGACATGTCATGAACCCATGTTTGCAATCAAAGATGATAAAATAGA 

TTCTTATTTTTCCCCCACCCCCGAAAATGTTCAATAATGTCCCATGTAAAACCTGCTACAAATGGCAGCT 

TATACATAGCAATGGTAAAATCATCATCTGGATTTAGGAATTGCTCTTGTCATACCCCCAAGTTTCTAAG 

ATTTAAGATTCTCCTTACTACTATCCTACGTTTAAATATCTTTGAAAGTTTGTATTAAATGTGAATTTTA 

AGAAATAATATTTATATTTCTGTAAATGTAAACTGTGAAGATAGTTATAAACTGAAGCAGATACCTGGAA 

CCACCTAAAGAACTTCCATTTATGGAGGATTTTTTTGCCCCTTGTGTTTGGAATTATAAAATATAGGTAA 

AAGTACGTAATTAAATAATGTTTTTGGTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

AAAAAAAAAAAAAAAAAAAAAAAAA (SEQ ID NO: 6692) 



gi|35910|emb|X12387.1|HSRCYP3 Human mRNA for cytochrome P-450 (cyp3 locus)

GAATTCCCAAAGAGCAACACAGAGCTGAAAGGAAGACTCAGAGGAGAGAGATAAGTAAGGAAAGTAGTGA 

TGGCTCTCATCCCAGACTTGGCCATGGAAACCTGGCTTCTCCTGGCTGTCAGCCTGGTGCTCCTCTATCT 

ATATGGAACCCATTCACATGGACTTTTTAAGAAGCTTGGAATTCCAGGGCCCACACCTCTGCCTTTTTTG 

GGAAATATTTTGTCCTACCATAAGGGCTTTTGTATGTTTGACATGGAATGTCATAAAAAGTATGGAAAAG 

TGTGGGGCTTTTATGATGGTCAACAGCCTGTGCTGGCTATCACAGATCCTGACATGATCAAAACAGTGCT 

AGTGAAAGAATGTTATTCTGTCTTCACAAACCGGAGGCCTTTTGGTCCAGTGGGATTTATGAAAAGTGCC 

ATCTCTATAGCTGAGGATGAAGAATGGAAGAGATTACGATCATTGCTGTCTCCAACCTTCACCAGTGGAA 

AACTCAAGGAGATGGTCCCTATCATTGCCCAGTATGGAGATGTGTTGGTGAGAAATCTGAGGCGGGAAGC 

AGAGACAGGCAAGCCTGTCACCTTGAAAGACGTCTTTGGGGCCTACAGCATGGATGTGATCACTAGCACA 

TCATTTGGAGTGAACATCGACTCTCTCAACAATCCACAAGACCCCTTTGTGGAAAACACCAAGAAGCTTT 
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TAAGATTTGATTTTTTGGATCCATTCTTTCTCTCAATAACAGTCTTTCCATTCCTCATCCCAATTCTTGA 
AGTATTAAATATCTGTGTGTTTCCAAGAGAAGTTACAAATTTTTTAAGAAAATCTGTAAAAAGGATGAAA 
GAAAGTCGCCTCGAAGATACACAAAAGCACCGAGTGGATTTCCTTCAGCTGATGATTGACTCTCAGAATT 
CAAAAGAAACTGAGTCCCACAAAGCTCTGTCCGATCTGGAGCTCGTGGCCCAATCAATTATCTTTATTTT 
TGCTGGCTATGAAACCACGAGCAGTGTTCTCTCCTTCATTATGTATGAACTGGCCACTCACCCTGATGTC
CAGCAGAAACTGCAGGAGGAAATTGATGCAGTTTTACCCAATAAGGCACCACCCACCTATGATACTGTGC
TACAGATGGAGTATCTTGACATGGTGGTGAATGAAACGCTCAGATTATTCCCAATTGCTATGAGACTTGA 
GAGGGTCTGCAAAAAAGATGTTGAGATCAATGGGATGTTCATTCCCAAAGGGTGGGTGGTGATGATTCCA 
AGCTATGCTCTTCACCGTGACCCAAAGTACTGGACAGAGCCTGAGAAGTTCCTCCCTGAAAGATTCAGCA 

AGAAGAACAAGGACAACATAGATCCTTACATATACACACCCTTTGGAAGTGGACCCAGAAACTGCATTGG
CATGAGGTTTGCTCTCATGAACATGAAACTTGCTCTAATCAGAGTCCTTCAGAACTTCTCCTTCAAACCT 
TGTAAAGAAACACAGATCCCCCTGAAATTAAGCTTAGGAGGACTTCTTCAACCAGAAAAACCCGTTGTTC 
TAAAGGTTGAGTCAAGGGATGGCACCGTAAGTGGAGCCTGAATTTTCCTAAGGACTTCTGCTTTGCTCTT 
CAAGAAATCTGTGCCTGAGAACACCAGAGACCTCAAATTACTTTGTGAATAGAACTCTGAAATGAAGATG 

GGCTTCATCCAATGGACTGCATAAATAACCGGGGATTCTGTACATGCATTGAGCTCTCTCATTGTCTGTG
TAGAGTGTTATACTTGGGAATATAAAGGAGGTGACCAAATCAGTGTGAGGAGGTAGATTTGGCTCCTCTG 
CTTCTCACGGGACTATTTCCACCACCCCCAGTTAGCACCATTAACTCCTCCTGAGCTCTGATAAGAGAAT 
CAACATTTCTCAATAATTTCCTCCACAAATTATTAATGAAAATAAGAATTATTTTGATGGCTCTAACAAT 
GACATTTATATCACATGTTTTCTCTGGAGTATTCTATAGTTTTATGTTAAATCAATAAAGACCACTTTAC 

AAAAGTATTATCAGATGCTTTCCTGCACATTAAGGAGAATCTATAGAACTGAATGAGAACCAACAAGTAA
ATATTTTTGGTCATTGTAATCACTGTTGGCGTGGGGCCTTTGTCAGAACTAGAATTTGATTATTAACATA 
GGTGAAAGTTAATCCACTGTGACTTTGCCCATTGTTTAGAAAGAATATTCATAGTTTAATTATGCCTTTT 
TTGATCAGGCACATGGCTCACGCCTGTAATCCTAGCAGTTTGGGAGGCTGAGCCGGGTGGATCGCCTGAG 
GTCAGGAGTTCAAGACAAGCCTGGCCTACATGGTGAAACCCCATCTCTACTAAAAATACACAAATTAGCT 

AGGCATGGTGGACTCGCCTGTAATCTCACTACACAGGAGGCTGAGGCAGGAGAATCACTTGAACCTGGGA
GGCGGATGTTGAAGTGAGCTGAGATTGCACCACTGCACTCCAGTCTGGGTGAGAGTGAGACTCAGTCTTA 
AAAAAATATGCCTTTTTGAAGCACGTACATTTTGTAACAAAGAACTGAAGCTCTTATTATATTATTAGTT 
TTGATTTAATGTTTTCAGCCCATCTCCTTTCATATTTCTGGGAGACAGAAAACATGTTTCCCTACACCTC 
TTGCTTCCATCCTCAACACCCAACTGTCTCGATGCAATGAACACTTAATAAAAAACAGTCGATTGGTCAA 

AAAAAAAAAAAAAAAAAAAAAAAGAATTC (SEQ ID NO: 6693)



gi|339549|gb|M19154.1|HUMTGFB2A Human transforming growth factor-beta-2
mRNA, complete cds 

GCCCCTCCCGTCAGTTCGCCAGCTGCCAGCCCCGGGACCTTTTCATCTCTTCCCTTTTGGCCGGAGGAGC
CGAGTTCAGATCCGCCACTCCGCACCCGAGACTGACACACTGAACTCCACTTCCTCCTCTTAAATTTATT 
TCTACTTAATAGCCACTCGTCTCTTTTTTTCCCCATCTCATTGCTCCAAGAATTTTTTTCTTCTTACTCG 
CCAAAGTCAGGGTTCCCTCTGCCCGTCCCGTATTAATATTTCCACTTTTGGAACTACTGGCCTTTTCTTT 
TTAAAGGAATTCAAGCAGGATACGTTTTTCTGTTGGGCATTGACTAGATTGTTTGCAAAAGTTTCGCATC 

AAAAACAACAACAACAAAAAACCAAACAACTCTCCTTGATCTATACTTTGAGAATTGTTGATTTCTTTTT
TTTATTCTGACTTTTAAAAACAACTTTTTTTTCCACTTTTTTAAAAAATGCACTACTGTGTGCTGAGCGC 
TTTTCTGATCCTGCATCTGGTCACGGTCGCGCTCAGCCTGTCTACCTGCAGCACACTCGATATGGACCAG 
TTCATGCGCAAGAGGATCGAGGCGATCCGCGGGCAGATCCTGAGCAAGCTGAAGCTCACCAGTCCCCCAG 
AAGACTATCCTGAGCCCGAGGAAGTCCCCCCGGAGGTGATTTCCATCTACAACAGCACCAGGGACTTGCT 

CCAGGAGAAGGCGAGCCGGAGGGCGGCCGCCTGCGAGCGCGAGAGGAGCGACGAAGAGTACTACGCCAAG
GAGGTTTACAAAATAGACATGCCGCCCTTCTTCCCCTCCGAAACTGTCTGCCCAGTTGTTACAACACCCT 
CTGGCTCAGTGGGCAGCTTGTGCTCCAGACAGTCCCAGGTGCTCTGTGGGTACCTTGATGCCATCCCGCC 
CACTTTCTACAGACCCTACTTCAGAATTGTTCGATTTGACGTCTCAGCAATGGAGAAGAATGCTTCCAAT 
TTGGTGAAAGCAGAGTTCAGAGTCTTTCGTTTGCAGAACCCAAAAGCCAGAGTGCCTGAACAACGGATTG 

AGCTATATCAGATTCTCAAGTCCAAAGATTTAACATCTCCAACCCAGCGCTACATCGACAGCAAAGTTGT
GAAAACAAGAGCAGAAGGCGAATGGCTCTCCTTCGATGTAACTGATGCTGTTCATGAATGGCTTCACCAT 
AAAGACAGGAACCTGGGATTTAAAATAAGCTTACACTGTCCCTGCTGCACTTTTGTACCATCTAATAATT 
ACATCATCCCAAATAAAAGTGAAGAACTAGAAGCAAGATTTGCAGGTATTGATGGCACCTCCACATATAC 
CAGTGGTGATCAGAAAACTATAAAGTCCACTAGGAAAAAAAACAGTGGGAAGACCCCACATCTCCTGCTA 

ATGTTATTGCCCTCCTACAGACTTGAGTCACAACAGACCAACCGGCGGAAGAAGCGTGCTTTGGATGCGG
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CCTATTGCTTTAGAAATGTGCAGGATAATTGCTGCCTACGTCCACTTTACATTGATTTCAAGAGGGATCT 
AGGGTGGAAATGGATACACGAACCCAAAGGGTACAATGCCAACTTCTGTGCTGGAGCATGCCCGTATTTA 
TGGAGTTCAGACACTCAGCACAGCAGGGTCCTGAGCTTATATAATACCATAAATCCAGAAGCATCTGCTT 
CTCCTTGCTGCGTGTCCCAAGATTTAGAACCTCTAACCATTCTCTACTACATTGGCAAAACACCCAAGAT 
TGAACAGCTTTCTAATATGATTGTAAAGTCTTGCAAATGCAGCTAAAATTCTTGGAAAAGTGGCAAGACC 
AAAATGACAATGATGATGATAATGATGATGACGACGACAACGATGATGCTTGTAACAAGAAAACATAAGA 
GAGCCTTGGTTCATCAGTGTTAAAAAATTTTTGAAAAGGCGGTACTAGTTCAGACACTTTGGAAGTTTGT 
GTTCTGTTTGTTAAAACTGGCATCTGACACAAAAAAAGTTGAAGGCCTTATTCTACATTTCACCTACTTT 
GTAAGTGAGAGAGACAAGAAGCAAATTTTTTTTAAAGAAAAAAATAAACACTGGAAGAATTTATTAGTGT 
TAATTATGTGAACAACGACAACAACAACAACAACAACAAACAGGAAAATCCCATTAAGTGGAGTTGCTGT 
ACGTACCGTTCCTATCCCGCGCCTCACTTGATTTTTCTGTATTGCTATGCAATAGGCACCCTTCCCATTC 
TTACTCTTAGAGTTAACAGTGAGTTATTTATTGTGTGTTACTATATAATGAACGTTTCATTGCCCTTGGA 
AAATAAAACAGGTGTATAAAGTGGAGACCAAATACTTTGCCAGAAACTCATGGATGGCTTAAGGAACTTG 
AACTCAAACGAGCCAGAAAAAAAGAGGTCATATTAATGGGATGAAAACCCAAGTGAGTTATTATATGACC 
GAGAAAGTCTGCATTAAGATAAAGACCCTGAAAACACATGTTATGTATCAGCTGCCTAAGGAAGCTTCTT 
GTAAGGTCCAAAAACTAAAAAGACTGTTAATAAAAGAAACTTTCAGTCAG (SEQ ID NO: 6694)



gi|186624|gb|J04111.1|HUMJUNA Human c-jun proto oncogene (JUN), complete
cds, clone hCJ-1 

CCCGGGGAGGGGACCGGGGAACAGAGGGCCGAGAGGCGTGCGGCAGGGGGGAGGGTAGGAGAAAGAAGGG 
CCCGACTGTAGGAGGGCAGCGGAGCATTACCTCATCCCGTGAGCCTCCGCGGGCCCAGAGAAGAATCTTC 
TAGGGTGGAGTCTCCATGGTGACGGGCGGGCCCGCCCCCCTGAGAGCGACGCGAGCCAATGGGAAGGCCT 
TGGGGTGACATCATGGGCTATTTTTAGGGGTTGACTGGTAGCAGATAAGTGTTGAGCTCGGGCTGGATAA 
GGGCTCAGAGTTGCACTGAGTGTGGCTGAAGCAGCGAGGCGGGAGTGGAGGTGCGCGGAGTCAGGCAGAC 
AGACAGACACAGCCAGCCAGCCAGGTCGGCAGTATAGTCCGAACTGCAAATCTTATTTTCTTTTCACCTT 
CTCTCTAACTGCCCAGAGCTAGCGCCTGTGGCTCCCGGGCTGGTGGTTCGGGAGTGTCCAGAGAGCCTTG 
TCTCCAGCCGGCCCCGGGAGGAGAGCCCTGCTGCCCAGGCGCTGTTGACAGCGGCGGAAAGCAGCGGTAC 
CCCACGCGCCCGCCGGGGGACGTCGGCGAGCGGCTGCAGCAGCAAAGAACTTTCCCGGCGGGGAGGACCG 
GAGACAAGTGGCAGAGTCCCGGAGCGAACTTTTGCAAGCCTTTCCTGCGTCTTAGGCTTCTCCACGGCGG 
TAAAGACCAGAAGGCGGCGGAGAGCCACGCAAGAGAAGAAGGACGTGCGCTCAGCTTCGCTCGCACCGGT 
TGTTGAACTTGGGCGAGCGCGAGCCGCGGCTGCCGGGCGCCCCCTCCCCCTAGCAGCGGAGGAGGGGACA 
AGTCGTCGGAGTCCGGGCGGCCAAGACCCGCCGCCGGCCGGCCACTGCAGGGTCCGCACTGATCCGCTCC 
GCGGGGAGAGCCGCTGCTCTGGGAAGTGAGTTCGCCTGCGGACTCCGAGGAACCGCTGCGCCCGAAGAGC 
GCTCAGTGAGTGACCGCGACTTTTCAAAGCCGGGTAGCGCGCGCGAGTCGACAAGTAAGAGTGCGGGAGG 
CATCTTAATTAACCCTGCGCTCCCTGGAGCGAGCTGGTGAGGAGGGCGCAGCGGGGACGACAGCCAGCGG 
GTGCGTGCGCTCTTAGAGAAACTTTCCCTGTCAAAGGCTCCGGGGGGCGCGGGTGTCCCCCGCTTGCCAG 
AGCCCTGTTGCGGCCCCGAAACTTGTGCGCGCACGCCAAACTAACCTCACGTGAAGTGACGGACTGTTCT 
ATGACTGCAAAGATGGAAACGACCTTCTATGACGATGCCCTCAACGCCTCGTTCCTCCCGTCCGAGAGCG 
GACCTTATGGCTACAGTAACCCCAAGATCCTGAAACAGAGCATGACCCTGAACCTGGCCGACCCAGTGGG 
GAGCCTGAAGCCGCACCTCCGCGCCAAGAACTCGGACCTCCTCACCTCGCCCGACGTGGGGCTGCTCAAG 
CTGGCGTCGCCCGAGCTGGAGCGCCTGATAATCCAGTCCAGCAACGGGCACATCACCACCACGCCGACCC 
CCACCCAGTTCCTGTGCCCCAAGAACGTGACAGATGAGCAGGAGGGGTTCGCCGAGGGCTTCGTGCGCGC 
CCTGGCCGAACTGCACAGCCAGAACACGCTGCCCAGCGTCACGTCGGCGGCGCAGCCGGTCAACGGGGCA 
GGCATGGTGGCTCCCGCGGTAGCCTCGGTGGCAGGGGGCAGCGGCAGCGGCGGCTTCAGCGCCAGCCTGC 
ACAGCGAGCCGCCGGTCTACGCAAACCTCAGCAACTTCAACCCAGGCGCGCTGAGCAGCGGCGGCGGGGC 
GCCCTCCTACGGCGCGGCCGGCCTGGCCTTTCCCGCGCAACCCCAGCAGCAGCAGCAGCCGCCGCACCAC 
CTGCCCCAGCAGATGCCCGTGCAGCACCCGCGGCTGCAGGCCCTGAAGGAGGAGCCTCAGACAGTGCCCG 
AGATGCCCGGCGAGACACCGCCCCTGTCCCCCATCGACATGGAGTCCCAGGAGCGGATCAAGGCGGAGAG 
GAAGCGCATGAGGAACCGCATCGCTGCCTCCAAGTGCCGAAAAAGGAAGCTGGAGAGAATCGCCCGGCTG 
GAGGAAAAAGTGAAAACCTTGAAAGCTCAGAACTCGGAGCTGGCGTCCACGGCCAACATGCTCAGGGAAC 
AGGTGGCACAGCTTAAACAGAAAGTCATGAACCACGTTAACAGTGGGTGCCAACTCATGCTAACGCAGCA 
GTTGCAAACATTTTGAAGAGAGACCGTCGGGGGCTGAGGGGCAACGAAGAAAAAAAATAACACAGAGAGA 
CAGACTTGAGAACTTGACAAGTTGCGACGGAGAGAAAAAAGAAGTGTCCGAGAACTAAAGCCAAGGGTAT 
CCAAGTTGGACTGGGTTCGGTCTGACGGCGCCCCCAGTGTGCACGAGTGGGAAGGACTTGGTCGCGCCCT 
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CCCTTGGCGTGGAGCCAGGGAGCGGCCGCCTGCGGGCTGCCCCGCTTTGCGGACGGGCTGTCCCCGCGCG 
AACGGAACGTTGGACTTTCGTTAACATTGACCAAGAACTGCATGGACCTAACATTCGATCTCATTCAGTA 
TTAAAGGGGGGAGGGGGAGGGGGTTACAAACTGCAATAGAGACTGTAGATTGCTTCTGTAGTACTCCTTA 
AGAACACAAAGCGGGGGGAGGGTTGGGGAGGGGCGGCAGGAGGGAGGTTTGTGAGAGCGAGGCTGAGCCT 
ACAGATGAACTCTTTCTGGCCTGCTTTCGTTAACTGTGTATGTACATATATATATTTTTTAATTTGATTA 
AAGCTGATTACTGTCAATAAACAGCTTCATGCCTTTGTAAGTTATTTCTTGTTTGTTTGTTTGGGTATCC 
TGCCCAGTGTTGTTTGTAAATAAGAGATTTGGAGCACTCTGAGTTTACCATTTGTAATAAAGTATATAAT 
TTTTTTATGTTTTGTTTCTGAAAATTCCAGAAAGGATATTTAAGAARATACAATAAACTATTGGAAAGTA 
CTCCCCTAACCTCTTTTCTGCATCATCTGTAGATCCTAGTCTATCTAGGTGGAGTTGAAAGAGTTAAGAA 
TGCTCGATAAAATCACTCTCAGTGCTTCTTACTATTAAGCAGTAAAAACTGTTCTCTATTAGACTTAGAA 
ATAAATGTACCTGATGTACCTGATGCTATGTCAGGCTTCATACTCCACGCTCCCCCAGCGTATCTATATG 
GAATTGCTTACCAAAGGCTAGTGCGATGTTTCAGGAGGCTGGAGGAAGGGGGGTTGCAGTGGAGAGGGAC 
AGCCCACTGAGAAGTCAAACATTTCAAAGTTTGGATTGCATCAAGTGGCATGTGCTGTGACCATTTATAA 
TGTTAGAAATTTTACAATAGGTGCTTATTCTCAAAGCAGGAATTGGTGGCAGATTTTACAAAAGATGTAT 
CCTTCCAATTTGGAATCTTCTCTTTGACAATTCCTAGATAAAAAGATGGCCTTTGTCTTATGAATATTTA 
TAACAGCATTCTGTCACAATAAATGTATTCAAATACCAATAACAGATCTTGAATTGCTTCCCTTTACTAC 
TTTTTTGTTCCCAAGTTATATACTGAAGTTTTTATTTTTAGTTGCTGAGGTT (SEQ ID NO: 6695) 



gi|179982|gb|M57729.1|HUMCCC5 Human complement component C5 mRNA, complete
cds 

CTACCTCCAACCATGGGCCTTTTGGGAATACTTTGTTTTTTAATCTTCCTGGGGAAAACCTGGGGACAGG 
AGCAAACATATGTCATTTCAGCACCAAAAATATTCCGTGTTGGAGCATCTGAAAATATTGTGATTCAAGT 
TTATGGATACACTGAAGCATTTGATGCAACAATCTCTATTAAAAGTTATCCTGATAAAAAATTTAGTTAC 
TCCTCAGGCCATGTTCATTTATCCTCAGAGAATAAATTCCAAAACTCTGCAATCTTAACAATACAACCAA 
AACAATTGCCTGGAGGACAAAACCCAGTTTCTTATGTGTATTTGGAAGTTGTATCAAAGCATTTTTCAAA 
ATCAAAAAGAATGCCAATAACCTATGACAATGGATTTCTCTTCATTCATACAGACAAACCTGTTTATACT 
CCAGACCAGTCAGTAAAAGTTAGAGTTTATTCGTTGAATGACGACTTGAAGCCAGCCAAAAGAGAAACTG 
TCTTAACCTTCATAGATCCTGAAGGATCAGAAGTTGACATGGTAGAAGAAATTGATCATATTGGAATTAT 
CTCTTTTCCTGACTTCAAGATTCCGTCTAATCCTAGATATGGTATGTGGACGATCAAGGCTAAATATAAA 
GAGGACTTTTCAACAACTGGAACCGCATATTTTGAAGTTAAAGAATATGTCTTGCCACATTTTTCTGTCT 
CAATCGAGCCAGAATATAATTTCATTGGTTACAAGAACTTTAAGAATTTTGAAATTACTATAAAAGCAAG 
ATATTTTTATAATAAAGTAGTCACTGAGGCTGACGTTTATATCACATTTGGAATAAGAGAAGACTTAAAA 
GATGATCAAAAAGAAATGATGCAAACAGCAATGCAAAACACAATGTTGATAAATGGAATTGCTCAAGTCA 
CATTTGATTCTGAAACAGCAGTCAAAGAACTGTCATACTACAGTTTAGAAGATTTAAACAACAAGTACCT 
TTATATTGCTGTAACAGTCATAGAGTCTACAGGTGGATTTTCTGAAGAGGCAGAAATACCTGGCATCAAA 
TATGTCCTCTCTCCCTACAAACTGAATTTGGTTGCTACTCCTCTTTTCCTGAAGCCTGGGATTCCATATC 
CCATCAAGGTGCAGGTTAAAGATTCGCTTGACCAGTTGGTAGGAGGAGTCCCAGTAATACTGAATGCACA 
AACAATTGATGTAAACCAAGAGACATCTGACTTGGATCCAAGCAAAAGTGTAACACGTGTTGATGATGGA 
GTAGCTTCCTTTGTGCTTAATCTCCCATCTGGAGTGACGGTGCTGGAGTTTAATGTCAAAACTGATGCTC 
CAGATCTTCCAGAAGAAAATCAGGCCAGGGAAGGTTACCGAGCAATAGCATACTCATCTCTCAGCCAAAG 
TTACCTTTATATTGATTGGACTGATAACCATAAGGCTTTGCTAGTGGGAGAACATCTGAATATTATTGTT 
ACCCCCAAAAGCCCATATATTGACAAAATAACTCACTATAATTACTTGATTTTATCCAAGGGCAAAATTA 
TCCATTTTGGCACGAGGGAGAAATTTTCAGATGCATCTTATCAAAGTATAAACATTCCAGTAACACAGAA 
CATGGTTCCTTCATCCCGACTTCTGGTCTATTATATCGTCACAGGAGAACAGACAGCAGAATTAGTGTCT 
GATTCAGTCTGGTTAAATATTGAAGAAAAATGTGGCAACCAGCTCCAGGTTCATCTGTCTCCTGATGCAG 
ATGCATATTCTCCAGGCCAAACTGTGTCTCTTAATATGGCAACTGGAATGGATTCCTGGGTGGCATTAGC 
AGCAGTGGACAGTGCTGTGTATGGAGTCCAAAGAGGAGCCAAAAAGCCCTTGGAAAGAGTATTTCAATTC 
TTAGAGAAGAGTGATCTGGGCTGTGGGGCAGGTGGTGGCCTCAACAATGCCAATGTGTTCCACCTAGCTG 
GACTTACCTTCCTCACTAATGCAAATGCAGATGACTCCCAAGAAAATGATGAACCTTGTAAAGAAATTCT 
CAGGCCAAGAAGAACGCTGCAAAAGAAGATAGAAGAAATAGCTGCTAAATATAAACATTCAGTAGTGAAG 
AAATGTTGTTACGATGGAGCCTGCGTTAATAATGATGAAACCTGTGAGCAGCGAGCTGCACGGATTAGTT 
TAGGGCCAAGATGCATCAAAGCTTTCACTGAATGTTGTGTCGTCGCAAGCCAGCTCCGTGCTAATATCTC 
TCATAAAGACATGCAATTGGGAAGGCTACACATGAAGACCCTGTTACCAGTAAGCAAGCCAGAAATTCGG 
AGTTATTTTCCAGAAAGCTGGTTGTGGGAAGTTCATCTTGTTCCCAGAAGAAAACAGTTGCAGTTTGCCC 


TACCTGATTCTCTAACCACCTGGGAAATTCAAGGCATTGGCATTTCAAACACTGGTATATGTGTTGCTGA 
TACTGTCAAGGCAAAGGTGTTCAAAGATGTCTTCCTGGAAATGAATATACCATATTCTGTTGTACGAGGA 
GAACAGATCCAATTGAAAGGAACTGTTTACAACTATAGGACTTCTGGGATGCAGTTCTGTGTTAAAATGT 
CTGCTGTGGAGGGAATCTGCACTTCGGAAAGCCCAGTCATTGATCATCAGGGCACAAAGTCCTCCAAATG 
TGTGCGCCAGAAAGTAGAGGGCTCCTCCAGTCACTTGGTGACATTCACTGTGCTTCCTCTGGAAATTGGC
CTTCACAACATCAATTTTTCACTGGAGACTTGGTTTGGAAAAGAAATCTTAGTAAAAACATTACGAGTGG 
TGCCAGAAGGTGTCAAAAGGGAAAGCTATTCTGGTGTTACTTTGGATCCTAGGGGTATTTATGGTACCAT 
TAGCAGACGAAAGGAGTTCCCATACAGGATACCCTTAGATTTGGTCCCCAAAACAGAAATCAAAAGGATT 
TTGAGTGTAAAAGGACTGCTTGTAGGTGAGATCTTGTCTGCAGTTCTAAGTCAGGAAGGCATCAATATCC 
TAACCCACCTCCCCAAAGGGAGTGCAGAGGCGGAGCTGATGAGCGTTGTCCCAGTATTCTATGTTTTTCA
CTACCTGGAAACAGGAAATCATTGGAACATTTTTCATTCTGACCCATTAATTGAAAAGCAGAAACTGAAG 
AAAAAATTAAAAGAAGGGATGTTGAGCATTATGTCCTACAGAAATGCTGACTACTCTTACAGTGTGTGGA 
AGGGTGGAAGTGCTAGCACTTGGTTAACAGCTTTTGCTTTAAGAGTACTTGGACAAGTAAATAAATACGT 
AGAGCAGAACCAAAATTCAATTTGTAATTCTTTATTGTGGCTAGTTGAGAATTATCAATTAGATAATGGA 
TCTTTCAAGGAAAATTCACAGTATCAACCAATAAAATTACAGGGTACCTTGCCTGTTGAAGCCCGAGAGA
ACAGCTTATATCTTACAGCCTTTACTGTGATTGGAATTAGAAAGGCTTTCGATATATGCCCCCTGGTGAA
AATCGACACAGCTCTAATTAAAGCTGACAACTTTCTGCTTGAAAATACACTGCCAGCCCAGAGCACCTTT 
ACATTGGCCATTTCTGCGTATGCTCTTTCCCTGGGAGATAAAACTCACCCACAGTTTCGTTCAATTGTTT 
CAGCTTTGAAGAGAGAAGCTTTGGTTAAAGGTAATCCACCCATTTATCGTTTTTGGAAAGACAATCTTCA 
GCATAAAGACAGCTCTGTACCTAACACTGGTACGGCACGTATGGTAGAAACAACTGCCTATGCTTTACTC
ACCAGTCTGAACTTGAAAGATATAAATTATGTTAACCCAGTCATCAAATGGCTATCAGAAGAGCAGAGGT 
ATGGAGGTGGCTTTTATTCAACCCAGGACACCATCAATGCCATTGAGGGCCTGACGGAATATTCACTCCT 
GGTTAAACAACTCCGCTTGAGTATGGACATCGATGTTTCTTACAAGCATAAAGGTGCCTTACATAATTAT 
AAAATGACAGACAAGAATTTCCTTGGGAGGCCAGTAGAGGTGCTTCTCAATGATGACCTCATTGTCAGTA 
CAGGATTTGGCAGTGGCTTGGCTACAGTACATGTAACAACTGTAGTTCACAAAACCAGTACCTCTGAGGA
AGTTTGCAGCTTTTATTTGAAAATCGATACTCAGGATATTGAAGCATCCCACTACAGAGGCTACGGAAAC 
TCTGATTACAAACGCATAGTAGCATGTGCCAGCTACAAGCCCAGCAGGGAAGAATCATCATCTGGATCCT 
CTCATGCGGTGATGGACATCTCCTTGCCTACTGGAATCAGTGCAAATGAAGAAGACTTAAAAGCCCTTGT 
GGAAGGGGTGGATCAACTATTCACTGATTACCAAATCAAAGATGGACATGTTATTCTGCAACTGAATTGG 
ATTCCCTCCAGTGATTTCCTTTGTGTACGATTCCGGATATTTGAACTCTTTGAAGTTGGGTTTCTCAGTC
CTGCCACTTTCACAGTTTACGAATACCACAGACCAGATAAACAGTGTACCATGTTTTATAGCACTTCCAA 
TATCAAAATTCAGAAAGTCTGTGAAGGAGCCGCGTGCAAGTGTGTAGAAGCTGATTGTGGGCAAATGCAG 
GAAGAATTGGATCTGACAATCTCTGCAGAGACAAGAAAACAAACAGCATGTAAACCAGAGATTGCATATG 
CTTATAAAGTTAGCATCACATCCATCACTGTAGAAAATGTTTTTGTCAAGTACAAGGCAACCCTTCTGGA 
TATCTACAAAACTGGGGAAGCTGTTGCTGAGAAAGACTCTGAGATTACCTTCATTAAAAAGGTAACCTGT
ACTAACGCTGAGCTGGTAAAAGGAAGACAGTACTTAATTATGGGTAAAGAAGCCCTCCAGATAAAATACA 
ATTTCAGTTTCAGGTACATCTACCCTTTAGATTCCTTGACCTGGATTGAATACTGGCCTAGAGACACAAC 
ATGTTCATCGTGTCAAGCATTTTTAGCTAATTTAGATGAATTTGCCGAAGATATCTTTTTAAATGGATGC 
TAAAATTCCTGAAGTTCAGCTGCATACAGTTTGCACTTATGGACTCCTGTTGTTGAAGTTCGTTTTTTTG 
TTTTCTTCTTTTTTTAAACATTCATAGCTGGTCTTATTTGTAAAGCTCACTTTACTTAGAATTAGTGGCA
CTTGCTTTTATTAGAGAATGATTTCAAATGCTGTAACTTTCTGAAATAACATGGCCTTGGAGGGCATGAA 
GACAGATACTCCTCCAAGGTTATTGGACACCGGAAACAATAAATTGGAACACCTCCTCAAACCTACCACT 
CAGGAATGTTTGCTGGGGCCGAAAGAACAGTCCATTGAAAGGGAGTATTACAAAAACATGGCCTTTGCTT 
GAAAGAAAATACCAAGGAACAGGAAACTGATCATTAAAGCCTGAGTTTGCTTTC (SEQ ID NO: 6696) 




gi|189944|gb|L05144.1|HUMPHOCAR Homo sapiens (clone lamda-hPEC-3)
phosphoenolpyruvate carboxykinase (PCK1) mRNA, complete cds 
TGGGAACACAAACTTGCTGGCGGGAAGAGCCCGGAAAGAAACCTGTGGATCTCCCTTCGAGATCATCCAA 

AGAGAAGAAAGGTGACCTCACATTCGTGCCCCTTAGCAGCACTCTGCAGAAATGCCTCCTCAGCTGCAAA
ACGGCCTGAACCTCTCGGCCAAAGTTGTCCAGGGAAGCCTGGACAGCCTGCCCCAGGCAGTGAGGGAGTT 
TCTCGAGAATAACGCTGAGCTGTGTCAGCCTGATCACATCCACATCTGTGACGGCTCTGAGGAGGAGAAT 
GGGCGGCTTCTGGGCCAGATGGAGGAAGAGGGCATCCTCAGGCGGCTGAAGAAGTATGACAACTGCTGGT 
TGGCTCTCACTGACCCCAGGGATGTGGCCAGGATCGAAAGCAAGACGGTTATCGTCACCCAAGAGCAAAG 

AGACACAGTGCCCATCCCCAAAACAGGCCTCAGCCAGCTCGGTCGCTGGATGTCAGAGGAGGATTTTGAG
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AAAGCGTTCAATGCCAGGTTCCCAGGGTGCATGAAAGGTCGCACCATGTACGTCATCCCATTCAGCATGG 
GGCCGCTGGGCTCACCTCTGTCGAAGATCGGCATCGAGCTGACGGATTCGCCCTACGTGGTGGCCAGCAT 
GCGGATCATGACGCGGATGGGCACGCCCGTCCTGGAAGCACTGGGCGATGGGGAGTTTGTCAAATGCCTC 
CATTCTGTGGGGTGCCCTCTGCCTTTACAAAAGCCTTTGGTCAACAACTGGCCCTGCAACCCGGAGCTGA 
CGCTCATCGCCCACCTGCCTGACCGCAGAGAGATCATCTCCTTTGGCAGTGGGTACGGCGGGAACTCGCT
GCTCGGGAAGAAGTGCTTTGCTCTCAGGATGGCCAGCCGGCTGGCAGAGGAGGAAGGGTGGCTGGCAGAG 
CACATGCTGATTCTGGGTATAACCAACCCTGAGGGTGAGAAGAAGTACCTGGCGGCCGCATTTCCCAGCG
CCTGCGGGAAGACCAACCTGGCCATGATGAACCCCAGCCTCCCCGGGTGGAAGGTTGAGTGCGTCGGGGA 
TGACATTGCCTGGATGAAGTTTGACGCACAAGGTCATTTAAGGGCCATCAACCCAGAAAATGGCTTTTTC 

GGTGTCGCTCCTGGGACTTCAGTGAAGACCAACCCCAATGCCATCAAGACCATCCAGAAGAACACAATCT
TTACCAATGTGGCCGAGACCAGCGACGGGGGCGTTTACTGGGAAGGCATTGATGAGCCGCTAGCTTCAGG 
CGTCACCATCACGTCCTGGAAGAATAAGGAGTGGAGCTCAGAGGATGGGGAACCTTGTGCCCACCCCAAC 
TCGAGGTTCTGCACCCCTGCCAGCCAGTGCCCCATCATTGATGCTGCCTGGGAGTCTCCGGAAGGTGTTC 
CCATTGAAGGCATTATCTTTGGAGGCCGTAGACCTGCTGGTGTCCCTCTAGTCTATGAAGCTCTCAGCTG 

GCAACATGGAGTCTTTGTGGGGGCGGCCATGAGATCAGAGGCCACAGCGGCTGCAGAACATAAAGGCAAA
ATCATCATGCATGACCCCTTTGCCATGCGGCCCTTCTTTGGCTACAACTTCGGCAAATACCTGGCCCACT 
GGCTTAGCATGGCCCAGCACCCAGCAGCCAAACTGCCCAAGATCTTCCATGTCAACTGGTTCCGGAAGGA 
CAAGGAAGGCAAATTCCTCTGGCCAGGCTTTGGAGAGAACTCCAGGGTGCTGGAGTGGATGTTCAACCGG 
ATCGATGGAAAAGCCAGCACCAACGTCACGCCCATAGGCTACATCCCCAAGGAGGATGCCCTGAACCTGA 

AAGGCCTGGGGCACATCAACATGATGGAGCTTTTCAGCATCTCCAAGGAATTCTGGGACAAGGAGGTGGA
AGACATCGAGAAGTATCTGGTGGATCAAGTCAATGCCGACCTCCCCTGTGAAATCGAGAGAGAGATCCTT 
GCCTTGAAGCAAAGAATAAGCCAGATGTAATCAGGGCCTGAGAATAAGCCAGATGTAATCAGGGCCTGAG 
TGCTTTACCTTTAAAATCATTAAATTAAAATCCATAAGGTGCAGTAGGAGCAAGAGAGGGCAAGTGTTCC 
CAAATTGACGCCACCTAATAATCATCACCACACCGGGAGCAGATCTGAAGGCACACTTTGATTTTTTTAA 

GGATAAGAACCACAGAACACTGGGTAGTAGCTAATGAAATTGAGAAGGGAAATCTTAGCATGCCTCCAAA
AATTCACATCCAATGCATACTTTGTTCAAATTTAAGGTTACTCAGGCATTGATCTTTTCAGTGTTTTTTC 
ACTTAGCTATGTGGATTAGCTAGAATGCACACCAAAAAGATACTTGAGCTGTATATATATATGTGTGTGT 
GTGTGTGTGTGTGTGTGTGTGTGCATGTATGTGCACATGTGTCTGTGTGATATTTGGTATGTGTATTTGT 
ATGTACTGTTATTCAAAATATATTTAATACCTTTGGAAAATCTTGGGCAAGATGACCTACTAGTTTTCCT 

TGAAAAAAAGTTGCTTTGTTATTAATATTGTGCTTAAATTATTTTTATACACCATTGTTCCTTACCTTTA
CATAATTGCAATATTTCCCCCTTACTACTTCTTGGAAAAAAATTAGAAAATGAAGTTTATAGAAAAG 
(SEQ ID NO: 6697) 



gi|6679892|ref|NM_008061.1| Mus musculus glucose-6-phosphatase, catalytic
(G6pc), mRNA 

AGCAGAGGGATCGGGGCCAACCGGGCTTGGACTCACTGCACGGGCTCTGCTGGCAGCTTCCTGAGGTACC 
AAGGGAGGAAGGATGGAGGAAGGAATGAACATTCTCCATGACTTTGGGATCCAGTCGACTCGCTATCTCC 
AAGTGAATTACCAAGACTCCCAGGACTGGTTCATCCTTGTGTCTGTGATTGCTGACCTGAGGAACGCCTT 

CTATGTCCTCTTTCCCATCTGGTTCCATCTTAAAGAGACTGTGGGCATCAATCTCCTCTGGGTGGCAGTG
GTCGGAGACTGGTTCAACCTCGTCTTCAAGTGGATTCTGTTTGGACAACGCCCGTATTGGTGGGTCCTGG 
ACACCGACTACTACAGCAACAGCTCCGTGCCTATAATAAAGCAGTTCCCTGTCACCTGTGAGACCGGACC 
AGGAAGTCCCTCTGGCCATGCCATGGGCGCAGCAGGTGTATACTATGTTATGGTCACTTCTACTCTTGCT 
ATCTTTCGAGGAAAGAAAAAGCCAACGTATGGATTCCGGTGTTTGAACGTCATCTTGTGGTTGGGATTCT 

GGGCTGTGCAGCTGAACGTCTGTCTGTCCCGGATCTACCTTGCTGCTCACTTTCCCCACCAGGTCGTGGC
TGGAGTCTTGTCAGGCATTGCTGTGGCTGAAACTTTCAGCCACATCCGGGGCATCTACAATGCCAGCCTC 
CGGAAGTATTGTCTCATCACCATCTTCTTGTTTGGTTTCGCGCTTGGATTCTACCTGCTACTAAAAGGGC 
TAGGGGTGGACCTCCTGTGGACTTTGGAGAAAGCCAAGAGATGGTGTGAGCGGCCAGAATGGGTCCACCT 
TGACACTACACCCTTTGCCAGCCTCTTCAAAAACCTGGGAACCCTCTTGGGGTTGGGGCTGGCCCTCAAC 

TCCAGCATGTACCGGAAGAGCTGCAAGGGAGAACTCAGCAAGTCGTTCCCATTCCGCTTCGCCTGCATTG
TGGCTTCCTTGGTCCTCCTGCATCTCTTTGACTCTCTGAAGCCCCCATCCCAGGTTGAGTTGATCTTCTA 
CATCTTGTCTTTCTGCAAGAGCGCAACAGTTCCCTTTGCATCTGTCAGTCTTATCCCATACTGCCTAGCC 
CGGATCCTGGGACAGACACACAAGAAGTCTTTGTAAGGCATGCAGAGTCTTTGGTATTTAAAGTCAACCG 
CCATGCAAAGGACTAGGAACAACTAAAGCCTCTGAAACCCATTGTGAGGCCAGAGGTGTTGACATCGGCC 

CTGGTAGCCCTGTCTTTCTTTGCTATCTTAACCAAAAGGTGAATTTTTACAAAGCTTACAGGGCTGTTTG
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AGGAAAGTGTGAATGCTGGAAACTGAGTCATTCTGGATGGTTCCCTGAAGATTCGCTTACCAGCCTCCTG 
TCAGATACAGAAGAGCAAGCCCAGGCTAGAGATCCCAACTGAGAATGCTCTTGCGGTGCAGAATCTTCCG 
GCTGGGAAAAGGAAAAGAGCACCATGCATTTGCCAGGAAGAGAAAGAAGGATCGGGAGGAGGGAGAGTGT 
TTTATGTATCGAGCAAACCAGATGCAATCTATGTCTAACCGGCTTCAGTTGTGTCTGCGTCTTTAGATAC 
GACACACTCAATAATAATAATAGACCAACTAGTGTAATGAGTAGCCAGTTAAAGGCGATTAATTCTGCTT
CCAGATAGTCTCCACTGTACATAAAAGTCACACTGTGTGCTTGCATTCCTGTATGGTAGTGGTGACTGTC 
TCTCACACCACCTTCTCTATCACGTCACAGTTTTCTCCTCCTCAGCCTATGTCTGCATTCCCCAGAATTC 
TCCACTTGTTCCCTGGCCCTGCTGCTGGACCCTGCTGTGTCTGGTAGGCAACTGTTTGTTGGTGCTTTTG 
TAGGGTTAAGTTAAACTCTGAGATCTTGGGCAAAATGGCAAGGAGACCCAGGATTCTTCTCTCCAAAGGT 
CACTCCGATGTTATTTTTGATTCCTGGGGCAGAAATATGACTCCTTTCCCTAGCCCAAGCCAGCCAAGAG
CTCTCATTCTTAGAAGAAAAGGCAGCCCCTTGGTGCCTGTCCTCCTGCCTCGGCTGATTTGCAGAGTACT 
TCTTCAAAAAGAAAAAAATGGTAAAGCTATTTATTAAAAATTCTTTGTTTTTTGCTACAAATGATGCATA 
TATTTTCACCCACACCAAGCACTTTGTTTCTAATATCTTTGATAAGAAAACTACATGTGCAGTATTTTAT 
TAAAGCAACATTTTATTTA (SEQ ID NO: 6698)




gi|7110682|ref|NM_011044.1| Mus musculus phosphoenolpyruvate carboxykinase
1, cytosolic (Pck1), mRNA

ACAGTTGGCCTTCCCTCTGGGAACACACCCTCGGTCAACAGGGGAAATCCGGCAAGGCGCTCAGCGATCT 

CTGATCCAGACCTTCCAAAAGGAAGAAAGGTGGCACCAGAGTTCCTGCCTCTCTCCACACCATTGCAATT
ATGCCTCCTCAGCTGCATAACGGTCTGGACTTCTCTGCCAAGGTTATCCAGGGCAGCCTCGACAGCCTGC 
CCCAGGCAGTGAGGAAGTTCGTGGAAGGCAATGCTCAGCTGTGCCAGCCGGAGTATATCCACATCTGCGA 
TGGCTCCGAGGAGGAGTACGGGCAGTTGCTGGCCCACATGCAGGAGGAGGGTGTCATCCGCAAGCTGAAG 
AAATATGAGAACTGTTGGCTGGCTCTCACTGACCCTCGAGATGTGGCCAGGATCGAAAGCAAGACAGTCA 

TCATCACCCAAGAGCAGAGAGACACAGTGCCCATCCCCAAAACTGGCCTCAGCCAGCTGGGCCGCTGGAT
GTCGGAAGAGGACTTTGAGAAAGCATTCAACGCCAGGTTCCCAGGGTGCATGAAAGGCCGCACCATGTAT 
GTCATCCCATTCAGCATGGGGCCACTGGGCTCGCCGCTGGCCAAGATTGGTATTGAACTGACAGACTCGC 
CCTATGTGGTGGCCAGCATGCGGATCATGACTCGGATGGGCATATCTGTGCTGGAGGCCCTGGGAGATGG 
GGAGTTCATCAAGTGCCTGCACTCTGTGGGGTGCCCTCTCCCCTTAAAAAAGCCTTTGGTCAACAACTGG 

GCCTGCAACCCTGAGCTGACCCTGATCGCCCACCTCCCGGACCGCAGAGAGATCATCTCCTTTGGAAGCG
GATATGGTGGGAACTCACTACTCGGGAAGAAATGCTTTGCGTTGCGGATCGCCAGCCGTCTGGCTAAGGA 
GGAAGGGTGGCTGGCGGAGCATATGCTGATCCTGGGCATAACTAACCCCGAAGGCAAGAAGAAATACCTG 
GCCGCAGCCTTCCCTAGTGCCTGTGGGAAGACTAACTTGGCCATGATGAACCCCAGCCTGCCCGGGTGGA 
AGGTCGAATGTGTGGGCGATGACATTGCCTGGATGAAGTTTGATGCCCAAGGCAACTTAAGGGCTATCAA 

CCCAGAAAACGGGTTTTTTGGAGTTGCTCCTGGCACCTCAGTGAAGACAAATCCAAATGCCATTAAAACC
ATCCAGAAAAACACCATCTTCACCAACGTGGCCGAGACTAGCGATGGGGGTGTTTACTGGGAAGGCATCG 
ATGAGCCGCTGGCCCCGGGAGTCACCATCACCTCCTGGAAGAACAAGGAGTGGAGACCGCAGGACGCGGA 
ACCATGTGCCCATCCCAACTCGAGATTCTGCACCCCTGCCAGCCAGTGCCCCATTATTGACCCTGCCTGG 
GAATCTCCAGAAGGAGTACCCATTGAGGGTATCATCTTTGGTGGCCGTAGACCTGAAGGTGTCCCCCTTG 

TCTATGAAGCCCTCAGCTGGCAGCATGGGGTGTTTGTAGGAGCAGCCATGAGATCTGAGGCCACAGCTGC
TGCAGAACACAAGGGCAAGATCATCATGCACGACCCCTTTGCCATGCGACCCTTCTTCGGCTACAACTTC 
GGCAAATACCTGGCCCACTGGCTGAGCATGGCCCACCGCCCAGCAGCCAAGTTGCCCAAGATCTTCCATG 
TCAACTGGTTCCGGAAGGACAAAGATGGCAAGTTCCTCTGGCCAGGCTTTGGCGAGAACTCCCGGGTGCT 
GGAGTGGATGTTCGGGCGGATTGAAGGGGAAGACAGCGCCAAGCTCACGCCCATCGGCTACATCCCTAAG 

GAAAACGCCTTGAACCTGAAAGGCCTGGGGGGCGTCAACGTGGAGGAGCTGTTTGGGATCTCTAAGGAGT
TCTGGGAGAAGGAGGTGGAGGAGATCGACAGGTATCTGGAGGACCAGGTCAACACCGACCTCCCTTACGA 
AATTGAGAGGGAGCTCCGAGCCCTGAAACAGAGAATCAGCCAGATGTAAATCCCAATGGGGGCGTCTCGA 
GAGTCACCCCTTCCCACTCACAGCATCGCTGAGATCTAGGAGAAAGCCAGCCTGCTCCAGCTTTGAGATA 
GCGGCACAATCGTGAGTAGATCAGAAAAGCACCTTTTAATAGTCAGTTGAGTAGCACAGAGAACAGGCTA 

GGGGCAAATAAGATTGGGAGGGGAAATCACCGCATAGTCTCTGAAGTTTGCATTTGACACCAATGGGGGT
TTTGGTTCCACTTCAAGGTCACTCAGGAATCCAGTTCTTCACGTTAGCTGTAGCAGTTAGCTAAAATGCA 
CAGAAAACATACTTGAGCTGTATATATGTGTGTGAACGTGTCTCTGTGTGAGCATGTGTGTGTGTGTGTG 
TGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTACATGCCTGTCTGTCCCATTGTCCACAGTATATTTAA 
AACCTTTGGGGAAAAATCTTGGGCAAATTTGTAGCTGTAACTAGAGAGTCATGTTGCTTTGTTGCTAGTA 

TGTATGTTTAAATTATTTTTATACACCGCCCTTACCTTTCTTTACATAATTGAAATTGGTATCCGGACCA

Example 6. siRNAs decrease mRNA levels in vivo

Male CMV-Luc mice (8-10 weeks old) from Xenogen (Cranbury, NJ) were
administered cholesterol-conjugated siRNA (see Table 16).



Table 16. Solutions administered to mice

Group   n   Injection Mix
1       7   Buffer (PBS [pH 7.4])
2       8   Cholesterol-conjugated siRNA (ALN-3001)



Table 17. Test iRNA agents targeting Luciferase

siRNA      Sequence
ALN-1070   5'-GAA CUG UGU GUG AGA GGU CCU-3' (SEQ ID NO: 6700)
           3'-CG CUU GAC ACA CAC UCU CCA GGA-5' (SEQ ID NO: 6701)
ALN-1000   5'-GAA CUG UGU GUG AGA GGU CCU-GS-3' (SEQ ID NO: 6702)
           3'-CG CUU GAC ACA CAC UCU CCA GGA-5' (SEQ ID NO: 6703)
ALN-3000   5'-GAA CUG UGU GUG AGA GGU CCU-3' (SEQ ID NO: 6704)
           3'-Cs¹Gs¹ CUU GAC ACA CAC UCU CCA GGA-5' (SEQ ID NO: 6705)
ALN-3001   5'-GAA CUG UGU GUG AGA GGU CCU-chol²-3' (SEQ ID NO: 6706)
           3'-Cs¹Gs¹ CUU GAC ACA CAC UCU CCA GGA-5' (SEQ ID NO: 6707)

¹ A 2'-O-Me group is attached to the nucleotide, and the nucleotides have phosphorothioate
linkages (indicated by "s").

² Cholesterol is conjugated to the antisense strand via the linker: U-pyrroline
carrier-C(O)-(CH2)5-NHC(O)-cholesterol (via the cholesterol C-3 hydroxyl).
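
The strand pairing listed in Table 17 can be checked computationally. The following is an illustrative sketch only: it uses the unmodified base sequences of the 21-nucleotide sense strand and the 23-nucleotide antisense strand as listed in the table, ignores the 2'-O-Me, phosphorothioate and cholesterol modifications, and assumes that the 5' end of the antisense strand pairs with the 3' end of the sense strand. The function names are hypothetical.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


# Sense strand, written 5'->3' as in Table 17 (spaces removed).
SENSE = "GAACUGUGUGUGAGAGGUCCU"
# The antisense strand is printed 3'->5' in Table 17; reverse it to read 5'->3'.
ANTISENSE_3_TO_5 = "CGCUUGACACACACUCUCCAGGA"
ANTISENSE = ANTISENSE_3_TO_5[::-1]


def describe_duplex(sense: str, antisense: str) -> dict:
    """Report the duplex length and the antisense 3' overhang.

    Assumes (for illustration) that the antisense 5' end is flush with the
    sense 3' end, so any unpaired bases sit at the antisense 3' end.
    """
    paired = reverse_complement(sense)
    if not antisense.startswith(paired):
        raise ValueError("strands are not complementary under this alignment")
    return {
        "duplex_length": len(paired),
        "antisense_3prime_overhang": antisense[len(paired):],
    }


result = describe_duplex(SENSE, ANTISENSE)
print(result)
```

Under these assumptions the check reports a 21-base-pair duplex with a two-nucleotide (GC) overhang at the 3' end of the antisense strand.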

Animals were injected (tail vein) with a volume of 200-250 µl test solution containing
buffer or an siRNA solution. Group 1 received buffer and group 2 received
cholesterol-conjugated siRNA (ALN-3001) at a dose of 50 mg/kg body weight. Twenty-two hours
after injection, animals were sacrificed and livers collected. Organs were snap frozen on
dry ice, then pulverized in a mortar and pestle.
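
The dosing arithmetic implied by this protocol can be made explicit. The following is a minimal sketch, not part of the described procedure: the 25 g body weight is an assumed example value, and the function name is hypothetical. It computes the mass of siRNA needed for a 50 mg/kg dose and the concentration required to deliver that mass in a 200 µl or 250 µl injection.

```python
def sirna_dose(body_weight_g: float, dose_mg_per_kg: float, volume_ul: float):
    """Return (siRNA mass in mg, required concentration in mg/ml)."""
    mass_mg = dose_mg_per_kg * body_weight_g / 1000.0  # convert g to kg
    conc_mg_per_ml = mass_mg / (volume_ul / 1000.0)    # convert ul to ml
    return mass_mg, conc_mg_per_ml


# Assumed example: a 25 g mouse dosed at 50 mg/kg, at both ends of the
# 200-250 ul injection-volume range given in the text.
for volume_ul in (200.0, 250.0):
    mass, conc = sirna_dose(25.0, 50.0, volume_ul)
    print(f"{volume_ul:.0f} ul: {mass:.2f} mg siRNA at {conc:.2f} mg/ml")
```

For the assumed 25 g animal this gives 1.25 mg of siRNA per injection, so the smaller injection volume requires a proportionally more concentrated solution.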

For Luciferase mRNA analysis (by the QuantiGene Assay (Genospectra, Inc.;
Fremont, CA)), approximately 10 mg of tissue powder was resuspended in tissue lysis buffer,
and processed according to the manufacturer's protocol. Samples of the lysate were
hybridized with probes specific for Luciferase or GAPDH (designed using ProbeDesigner
software (Genospectra, Inc., Fremont, CA)) in triplicate, and processed for luminometric
analysis. Values for Luciferase were normalized to GAPDH. Mean values were plotted with
error bars corresponding to the standard deviation of the Luciferase measurements.
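
The normalization described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis actually performed: the readings are made-up placeholder values in arbitrary units (they are not data from this example), the function names are hypothetical, and the standard deviation is taken over the GAPDH-normalized values, which is one possible reading of the text.

```python
from statistics import mean, stdev


def normalize(luc_signals, gapdh_signals):
    """Divide each Luciferase reading by its paired GAPDH reading."""
    return [luc / gapdh for luc, gapdh in zip(luc_signals, gapdh_signals)]


def summarize(ratios):
    """Return (mean, sample standard deviation) of the normalized values."""
    return mean(ratios), stdev(ratios)


def percent_reduction(treated_mean, control_mean):
    """Percent decrease of the treated mean relative to the control mean."""
    return 100.0 * (1.0 - treated_mean / control_mean)


# Placeholder triplicate readings (arbitrary units), NOT measured data.
buffer_ratios = normalize([900.0, 1000.0, 1100.0], [1000.0, 1000.0, 1000.0])
sirna_ratios = normalize([450.0, 500.0, 550.0], [1000.0, 1000.0, 1000.0])

buffer_mean, buffer_sd = summarize(buffer_ratios)
sirna_mean, sirna_sd = summarize(sirna_ratios)

print(f"buffer: {buffer_mean:.3f} +/- {buffer_sd:.3f}")
print(f"siRNA:  {sirna_mean:.3f} +/- {sirna_sd:.3f}")
print(f"reduction: {percent_reduction(sirna_mean, buffer_mean):.1f}%")
```

Normalizing to GAPDH before averaging corrects for differences in the amount of tissue lysate loaded per sample, so that the comparison between groups reflects Luciferase mRNA abundance rather than input quantity.
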

Results indicated that the level of luciferase RNA in animals injected with
cholesterol-conjugated siRNA was reduced by about 70% as compared to animals injected with
buffer (see FIGs. 6A and 6B).

In Vitro Activity

HeLa cells expressing luciferase were transfected with each of the siRNAs listed in
Table 17. ALN-1000 siRNAs were most effective at decreasing luciferase mRNA levels
(~0.6 nM siRNA decreased mRNA levels to about 65% of the original expression level, and
1.0 nM siRNA decreased levels to about 20% of the original expression level); ALN-3001
siRNAs were least effective (~0.6 nM siRNA had a negligible effect on mRNA levels, and
1.0 nM siRNA decreased levels to about 40% of the original expression level).

Pharmacokinetics/Biodistribution 

Pharmacokinetic analyses were performed in mice and rats. Test siRNA molecules
were radioactively labeled with ³³P on the antisense strand by splint ligation. Labeled
siRNAs (50 mg/kg) were administered by tail vein injection, and plasma levels of siRNA
were measured periodically over 24 hours by scintillation counting. Cholesterol-conjugated
siRNA (ALN-3001) was found to circulate in mouse plasma for a longer period of time
than unconjugated siRNA (ALN-3000) (FIG. 7). RNAse protection assays indicated that
cholesterol-conjugated siRNA (ALN-3001) was detectable in mouse plasma 12 hours after
injection, whereas unconjugated siRNA (ALN-3000) was not detectable in mouse plasma
within two hours following injection. Similar results were observed in rats.

Mouse liver was harvested at varying time points (ranging from 0.08-24 hours)
following injection with siRNA, and siRNA localized to the liver was quantified. Over the
time period tested, the amount of cholesterol-conjugated siRNA (ALN-3001) detected in the
liver ranged from 14.3-3.55 percent of the total dose administered to the mouse. The amount
of unconjugated siRNA (ALN-3000) detected in the liver was lower, ranging from
3.91-1.75 percent of the total dose administered.

Detection of siRNA in Different Tissues

Various tissues and organs (fat, heart, kidney, liver, and spleen) were harvested from
two CMV-Luc mice 22 hours following injection with 50 mg/kg ALN-3001. The antisense
strand of the siRNA was detected by RNAse protection assay. The liver contained the
greatest concentration of siRNA (~8-10 µg siRNA/g tissue); the spleen, heart and kidney
contained lesser amounts of siRNA (~2-7 µg siRNA/g tissue); and fat tissue contained the
least amount of siRNA (<~1 µg siRNA/g tissue).

Glucose-6-phosphatase siRNA detection by RNAse Protection Assay

BALB/c mice were injected with U/U, 3'C/U, or 3'C/3'C siRNA (4 mg/kg) targeting
glucose-6-phosphatase (G6Pase) (see Table 18). Administration was by hydrodynamic tail
vein injection (hd) or non-hydrodynamic tail vein injection (iv), and siRNA was
subsequently detected in the liver by RNAse protection assay.



Table 18. Test iRNA agents targeting glucose-6-phosphatase

siRNA     Description
U/U       No cholesterol; dinucleotide 3' overhangs on sense and antisense strands
3'C/U     Dinucleotide 3' overhangs on sense and antisense strands; cholesterol
          conjugated to 3' end of sense strand (mono-conjugate)
3'C/3'C   Dinucleotide 3' overhangs on sense and antisense strands; cholesterol
          conjugated to 3' end of both sense and antisense strands (bis-conjugate)



Unconjugated siRNA (U/U) delivered by hd was detected by 15 min. post-injection
(the earliest determined time-point) and was still detectable in the liver 18 hours
post-injection.

Delivery by normal iv administration resulted in the greatest concentration of 3'C/3'C
siRNA (the bis-cholesterol-conjugate) in the liver 1 hour post-injection (as compared to the
mono-cholesterol-conjugate 3'C/U siRNA). At 18 hours post-injection, 3'C/3'C siRNA
and 3'C/U siRNA were still detectable in the liver, with the bis-conjugate at higher levels
than the mono-conjugate.

While this invention has been particularly shown and described with reference to
preferred embodiments thereof, it will be understood by those skilled in the art that various
changes in form and details may be made therein without departing from the scope of the
invention encompassed by the appended claims.

WHAT IS CLAIMED IS:

1. An iRNA agent comprising a sense sequence and an antisense sequence, wherein
the sense sequence has one or more asymmetrical 2'-O-alkyl modifications and the antisense
sequence has one or more asymmetrical phosphorothioate modifications, and the antisense
sequence targets a human gene sequence.

2. The iRNA agent of claim 1, wherein at least one of said 2'-O-alkyl modifications
is a 2'-OMe modification.

3. The iRNA agent of claim 1, wherein the sense sequence has at least 2
asymmetrical 2'-O-alkyl modifications.

4. The iRNA agent of claim 1, wherein the sense has at least 4 asymmetrical
2'-O-alkyl modifications.

5. The iRNA agent of claim 4, wherein the asymmetrical modifications are 2'-OMe 
modifications. 

6. The iRNA agent of claim 1, wherein the sense sequence has at least 6
asymmetrical 2'-O-alkyl modifications.

7. The iRNA agent of claim 6, wherein the asymmetrical modifications are 2'-OMe
modifications.

8. The iRNA agent of claim 1, wherein the sense sequence has at least 8
asymmetrical 2'-O-alkyl modifications.

9. The iRNA agent of claim 8, wherein the asymmetrical modifications are 2'-OMe
modifications.

10. The iRNA agent of claim 1, wherein all of the subunits of the sense sequence
have an asymmetrical 2'-O-alkyl modification.

11. The iRNA agent of claim 10, wherein the asymmetrical modifications are 2'-OMe 
modifications. 

12. The iRNA agent of claim 1, wherein the antisense sequence has at least 2 
asymmetrical phosphorothioate modifications. 

13. The iRNA agent of claim 1, wherein the antisense sequence has at least 4 
asymmetrical phosphorothioate modifications. 

14. The iRNA agent of claim 1, wherein the antisense sequence has at least 6 
asymmetrical phosphorothioate modifications. 

15. The iRNA agent of claim 1, wherein the antisense sequence has at least 8
asymmetrical phosphorothioate modifications.

16. The iRNA agent of claim 1, wherein all of the subunits of the sense sequence 
have an asymmetrical phosphorothioate modification. 

17. The iRNA agent of claim 1, wherein the sense and antisense sequences are on 
different RNA strands. 

18. The iRNA agent of claim 1, wherein the sense and antisense sequences are on the 
same RNA strand. 

19. The iRNA agent of claim 1, wherein the sense and antisense sequences are fully 
complementary to each other. 

20. The iRNA agent of claim 1, further comprising a cholesterol moiety.

21. The iRNA agent of claim 20, wherein said cholesterol moiety is coupled to a
sense strand.

22. The iRNA agent of claim 20, further comprising a second cholesterol moiety.

23. The iRNA agent of claim 22, wherein said second cholesterol moiety is coupled 
to a sense strand. 

24. The iRNA agent of claim 1, wherein said human gene is an oncogene. 

25. The iRNA agent of claim 1, wherein said human gene is the apoB-100 gene. 

26. The iRNA agent of claim 1, wherein said human gene is the
glucose-6-phosphatase gene.

27. The iRNA agent of claim 1, wherein the said human gene is the beta catenin
gene.

28. The iRNA agent of claim 1, wherein the iRNA agent is at least 21 nucleotides in
length, and the duplex region of the iRNA is about 19 nucleotides in length.

29. The iRNA agent of claim 1, having a duplex region of about 19 subunits in
length and one or two 3' overhangs of about 2 subunits in length.

30. A pharmaceutical preparation comprising the iRNA agent of claim 1. 

31. A method for reducing apoB-100 levels in a subject comprising administering to
a subject an iRNA agent comprising a sense strand sequence and an antisense sequence,
wherein the sense sequence has at least 4 asymmetrical 2'-O-alkyl modifications and the
antisense sequence has at least 4 asymmetrical phosphorothioate modifications, and the
antisense sequence targets apoB-100.

32. The method of claim 31, wherein the subject is suffering from a disorder
characterized by elevated or otherwise unwanted expression of apoB-100, elevated or
otherwise unwanted levels of cholesterol, and/or dysregulation of lipid metabolism.

33. The method of claim 32, wherein said disorder is chosen from the group of
HDL/LDL cholesterol imbalance; dyslipidemias; hypercholesterolemia; statin-resistant
hypercholesterolemia; coronary artery disease (CAD); coronary heart disease (CHD);
atherosclerosis.

34. A method for reducing glucose-6-phosphatase levels in a subject comprising
administering to a subject an iRNA agent comprising a sense strand sequence and an
antisense sequence, wherein the sense sequence has at least 4 asymmetrical 2'-O-alkyl
modifications and the antisense sequence has at least 4 asymmetrical phosphorothioate
modifications, and the antisense sequence targets glucose-6-phosphatase.

35. The method of claim 34, wherein the iRNA agent is administered to a subject to
inhibit hepatic glucose production, or for the treatment of a glucose-metabolism-related
disorder.

36. The method of claim 35, wherein said disorder is diabetes.

37. The method of claim 35, wherein said disorder is type-2 diabetes.

38. A method of making an iRNA agent, the method comprising:
providing a sense strand sequence having at least 4 asymmetrical 2'-O-alkyl
modifications and an antisense sequence having at least 4 asymmetrical phosphorothioate
modifications, and allowing the sense and antisense strand to hybridize.

39. A method of stabilizing an iRNA agent, comprising selecting a sequence with
activity, and introducing one or more asymmetrical modification in said sequence, wherein
said modification decreases nuclease sensitivity while not decreasing activity.



WO 2004/080406    PCT/US2004/007070

[Drawing sheets 1/10 through 10/10. Recoverable figure labels: FIG. 4 (sheet 4/10); FIG. 5 and FIG. 5 (Cont'd) (sheets 5/10 through 8/10).]